TADG-15: AN EXTRACELLULAR SERINE PROTEASE
OVEREXPRESSED IN BREAST AND OVARIAN CARCINOMAS 




BACKGROUND OF THE INVENTION 

Field of the Invention

The present invention relates generally to the fields of 
cellular biology and the diagnosis of neoplastic disease. More 
specifically, the present invention relates to an extracellular serine 
protease termed Tumor Antigen Derived Gene-15 (TADG-15), which is
overexpressed in breast and ovarian carcinomas.

Description of the Related Art 

Extracellular proteases have been directly associated with 
tumor growth, shedding of tumor cells and invasion of target organs. 
Individual classes of proteases are involved in, but not limited to, (1)
the digestion of stroma surrounding the initial tumor area; (2) the
digestion of the cellular adhesion molecules to allow dissociation of
tumor cells; and (3) the invasion of the basement membrane for
metastatic growth and the activation of both tumor growth factors
and angiogenic factors.

The prior art is deficient in the lack of effective means of
screening to identify proteases overexpressed in carcinoma. The
present invention fulfills this longstanding need and desire in the art.

SUMMARY OF THE INVENTION 

The present invention discloses a screening program to
identify proteases overexpressed in carcinoma by examining PCR
products amplified using differential display in early-stage tumors
and metastatic tumors compared to those of normal tissues.

In one embodiment of the present invention, there is
provided a DNA encoding a TADG-15 protein selected from the group
consisting of: (a) isolated DNA which encodes a TADG-15 protein; (b)
isolated DNA which hybridizes to isolated DNA of (a) above and which
encodes a TADG-15 protein; and (c) isolated DNA differing from the
isolated DNAs of (a) and (b) above in codon sequence due to the
degeneracy of the genetic code, and which encodes a TADG-15
protein.

In another embodiment of the present invention, there is 
provided a vector capable of expressing the DNA of the present 
invention adapted for expression in a recombinant cell and regulatory 
elements necessary for expression of the DNA in the cell.

In yet another embodiment of the present invention, 
there is provided a host cell transfected with the vector of the present 
invention, the vector expressing a TADG-15 protein. 
In still yet another embodiment of the present invention,
there is provided a method of detecting expression of a TADG-15
mRNA, comprising the steps of: (a) contacting mRNA obtained from
the cell with the labeled hybridization probe; and (b) detecting
hybridization of the probe with the mRNA.

Other and further aspects, features, and advantages of the
present invention will be apparent from the following description of 
the presently preferred embodiments of the invention given for the 
purpose of disclosure. 


BRIEF DESCRIPTION OF THE DRAWINGS 

So that the manner in which the above-recited features,
advantages and objects of the invention, as well as others which will
become clear, are attained and can be understood in detail, more
particular descriptions of the invention briefly summarized above
may be had by reference to certain embodiments thereof which are
illustrated in the appended drawings. These drawings form a part of
the specification. It is to be noted, however, that the appended
drawings illustrate preferred embodiments of the invention and
therefore are not to be considered limiting in their scope.

Figure 1 shows a comparison of PCR products derived 
from normal and breast carcinoma cDNA as shown by staining in an 
agarose gel. 

Figure 2 shows a comparison of the serine protease
catalytic domain of TADG-15 (SEQ ID No: 14) with hepsin (Heps, SEQ
ID No: 3), (Scce, SEQ ID No: 4), trypsin (Try, SEQ ID No: 5),
chymotrypsin (Chymb, SEQ ID No: 6), factor 7 (Fac7, SEQ ID No:
7) and tissue plasminogen activator (Tpa, SEQ ID No: 8). The asterisks
indicate conserved amino acids of the catalytic triad.

Figure 3 shows quantitative PCR analysis of TADG-15
expression.

Figure 4 shows the ratio of TADG-15 expression to
expression of β-tubulin in normal tissues, low malignant potential
tumors (LMP) and carcinomas.

Figure 5 shows the TADG-15 expression in tumor cell
lines derived from both ovarian and breast carcinoma tissues.

Figure 6 shows the overexpression of TADG-15 in other
tumor tissues.

Figure 7 shows the Northern blots of TADG-15 expression 
in ovarian carcinomas, fetal and normal adult tissues. 

Figure 8 shows a diagram of the TADG-15 transcript and
the clones with the origin of their derivation.

Figure 9 shows the nucleotide sequence of the TADG-15
cDNA (SEQ ID No: 1) and the amino acid sequence of the TADG-15 protein
(SEQ ID No: 2).

Figure 10 shows the amino acid sequence of the TADG-15
protease including functional sites and domains.

Figure 11 shows a structure diagram of the TADG-15 
protein including functional domains. 

Figure 12 shows a nucleotide sequence comparison 
between TADG-15 and human SNC-19 (GenBank accession #U20428).




DETAILED DESCRIPTION OF THE INVENTION 
As used herein, the term "cDNA" shall refer to the DNA
copy of the mRNA transcript of a gene.

As used herein, the term "derived amino acid sequence"
shall mean the amino acid sequence determined by reading the triplet
sequence of nucleotide bases in the cDNA.

As used herein, the term "screening a library" shall refer
to the process of using a labeled probe to check whether, under the
appropriate conditions, there is a sequence complementary to the
probe present in a particular DNA library. In addition, "screening a
library" could be performed by PCR.

As used herein, the term "PCR" refers to the polymerase
chain reaction that is the subject of U.S. Patent Nos. 4,683,195 and
4,683,202 to Mullis, as well as other improvements now known in the
art.

The TADG-15 cDNA is 3147 base pairs long (SEQ ID No:1)
and encodes an 855 amino acid protein (SEQ ID No:2). The
availability of the TADG-15 gene opens the way for a number of studies
that can lead to various applications. For example, the TADG-15 gene
can be used as a diagnostic or therapeutic target in ovarian carcinoma
and other carcinomas including breast, prostate, lung and colon.
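As a rough consistency check on the two lengths given above, the following sketch computes how many nucleotides an 855 amino acid open reading frame would occupy within a 3147 base pair cDNA. The function name and the assumption that the reading frame ends in a single stop codon are illustrative only and are not taken from the sequence listing.

```python
def coding_region_length(num_amino_acids: int, include_stop: bool = True) -> int:
    """Return the number of nucleotides needed to encode a protein.

    Each amino acid is encoded by one three-nucleotide codon; the stop
    codon (assumed here to be a single codon) adds three more.
    """
    codons = num_amino_acids + (1 if include_stop else 0)
    return codons * 3


CDNA_LENGTH = 3147     # base pairs, as stated for the TADG-15 cDNA
PROTEIN_LENGTH = 855   # amino acids, as stated for the TADG-15 protein

coding_nt = coding_region_length(PROTEIN_LENGTH)
noncoding_nt = CDNA_LENGTH - coding_nt  # remainder available for untranslated regions

print(f"coding region: {coding_nt} nt, remaining non-coding: {noncoding_nt} nt")
```

Under these assumptions the stated protein length fits within the stated cDNA length, with the remainder attributable to untranslated sequence.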

In accordance with the present invention there may be
employed conventional molecular biology, microbiology, and
recombinant DNA techniques within the skill of the art. Such
techniques are explained fully in the literature. See, e.g., Maniatis,
Fritsch & Sambrook, "Molecular Cloning: A Laboratory Manual" (1982);
"DNA Cloning: A Practical Approach," Volumes I and II (D.N. Glover ed.
1985); "Oligonucleotide Synthesis" (M.J. Gait ed. 1984); "Nucleic Acid
Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)];
"Transcription and Translation" [B.D. Hames & S.J. Higgins eds. (1984)];
"Animal Cell Culture" [R.I. Freshney, ed. (1986)]; "Immobilized Cells
And Enzymes" [IRL Press, (1986)]; B. Perbal, "A Practical Guide To
Molecular Cloning" (1984).

Therefore, if appearing herein, the following terms shall
have the definitions set out below.

The amino acids described herein are preferred to be in the
"L" isomeric form. However, residues in the "D" isomeric form can be
substituted for any L-amino acid residue, as long as the desired
functional property of immunoglobulin-binding is retained by the
polypeptide. NH2 refers to the free amino group present at the amino
terminus of a polypeptide. COOH refers to the free carboxy group
present at the carboxy terminus of a polypeptide. In keeping with
standard polypeptide nomenclature, J. Biol. Chem., 243:3552-59
(1969), abbreviations for amino acid residues are shown in the
following Table of Correspondence:
TABLE OF CORRESPONDENCE

SYMBOL
1-Letter    3-Letter    AMINO ACID
Y           Tyr         tyrosine
G           Gly         glycine
F           Phe         phenylalanine
M           Met         methionine
A           Ala         alanine
S           Ser         serine
I           Ile         isoleucine
L           Leu         leucine
T           Thr         threonine
V           Val         valine
P           Pro         proline
K           Lys         lysine
H           His         histidine
Q           Gln         glutamine
E           Glu         glutamic acid
W           Trp         tryptophan
R           Arg         arginine
D           Asp         aspartic acid
N           Asn         asparagine
C           Cys         cysteine






It should be noted that all amino-acid residue sequences 
are represented herein by formulae whose left and right orientation 
is in the conventional direction of amino-terminus to carboxy- 
terminus. Furthermore, it should be noted that a dash at the 
beginning or end of an amino acid residue sequence indicates a
peptide bond to a further sequence of one or more amino-acid
residues. The above Table is presented to correlate the three-letter
and one-letter notations which may appear alternately herein. 
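The correlation between the two notations in the Table of Correspondence can be sketched as a simple lookup. The function name and the use of dash-separated input are assumptions made for illustration; a leading or trailing dash, which the text says indicates a bond to further residues, is simply ignored here.

```python
# Three-letter to one-letter symbols, following the Table of Correspondence.
THREE_TO_ONE = {
    "Tyr": "Y", "Gly": "G", "Phe": "F", "Met": "M", "Ala": "A",
    "Ser": "S", "Ile": "I", "Leu": "L", "Thr": "T", "Val": "V",
    "Pro": "P", "Lys": "K", "His": "H", "Gln": "Q", "Glu": "E",
    "Trp": "W", "Arg": "R", "Asp": "D", "Asn": "N", "Cys": "C",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a dash-separated three-letter sequence (amino-terminus first)
    to one-letter notation. Empty fields produced by a leading or trailing
    dash are skipped."""
    residues = [r for r in three_letter_sequence.split("-") if r]
    return "".join(THREE_TO_ONE[r] for r in residues)


print(to_one_letter("Met-Gly-Ser"))
```

An unknown symbol raises a `KeyError`, which is a deliberate choice in this sketch so that mistyped residues are not silently dropped.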

A "replicon" is any genetic element (e.g., plasmid,
chromosome, virus) that functions as an autonomous unit of DNA
replication in vivo; i.e., capable of replication under its own
control.

A "vector" is a replicon, such as plasmid, phage or cosmid, 
to which another DNA segment may be attached so as to bring about 
the replication of the attached segment. 

A "DNA molecule" refers to the polymeric form of
deoxyribonucleotides (adenine, guanine, thymine, or cytosine) in
either its single-stranded form or a double-stranded helix. This term
refers only to the primary and secondary structure of the molecule,
and does not limit it to any particular tertiary forms. Thus, this term
includes double-stranded DNA found, inter alia, in linear DNA
molecules (e.g., restriction fragments), viruses, plasmids, and
chromosomes. The structure is discussed herein according to the
normal convention of giving only the sequence in the 5' to 3' direction
along the nontranscribed strand of DNA (i.e., the strand having a
sequence homologous to the mRNA).

An "origin of replication" refers to those DNA sequences
that participate in DNA synthesis.

A DNA "coding sequence" is a double-stranded DNA
sequence which is transcribed and translated into a polypeptide in
vivo when placed under the control of appropriate regulatory
sequences. The boundaries of the coding sequence are determined by
a start codon at the 5' (amino) terminus and a translation stop codon
at the 3' (carboxyl) terminus. A coding sequence can include, but is
not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA,
genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and
even synthetic DNA sequences. A polyadenylation signal and
transcription termination sequence will usually be located 3' to the
coding sequence.
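The boundary rule in this definition (a start codon at the 5' end and an in-frame stop codon at the 3' end) can be illustrated with a minimal sketch. It assumes the standard ATG start codon and the three standard stop codons, reads only the strand as given, and uses a made-up toy sequence; it is not a substitute for a full gene-finding tool.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str):
    """Return (start, end) indices of the first ATG-initiated reading frame
    that ends in an in-frame stop codon, or None if there is none.

    `end` is exclusive and includes the stop codon, so dna[start:end]
    is the coding sequence written 5' to 3'.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through the sequence one codon at a time, in frame with the ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        start = dna.find("ATG", start + 1)
    return None


toy = "GGATGAAATTTTAACC"  # hypothetical sequence for illustration only
print(find_coding_sequence(toy))
```

Note that the TGA overlapping the start codon in the toy sequence is out of frame and is therefore not treated as a stop.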
Transcriptional and translational control sequences are
DNA regulatory sequences, such as promoters, enhancers,
polyadenylation signals, terminators, and the like, that provide for the
expression of a coding sequence in a host cell.

A "promoter sequence" is a DNA regulatory region capable
of binding RNA polymerase in a cell and initiating transcription of a
downstream (3' direction) coding sequence. For purposes of defining
the present invention, the promoter sequence is bounded at its 3'
terminus by the transcription initiation site and extends upstream (5'
direction) to include the minimum number of bases or elements
necessary to initiate transcription at levels detectable above
background. Within the promoter sequence will be found a
transcription initiation site, as well as protein binding domains
(consensus sequences) responsible for the binding of RNA polymerase.
Eukaryotic promoters often, but not always, contain "TATA" boxes and
"CAT" boxes. Prokaryotic promoters contain Shine-Dalgarno
sequences in addition to the -10 and -35 consensus sequences.

An "expression control sequence" is a DNA sequence that
controls and regulates the transcription and translation of another
DNA sequence. A coding sequence is "under the control" of
transcriptional and translational control sequences in a cell when RNA
polymerase transcribes the coding sequence into mRNA, which is then
translated into the protein encoded by the coding sequence.

A "signal sequence" can be included near the coding
sequence. This sequence encodes a signal peptide, N-terminal to the
polypeptide, that communicates to the host cell to direct the
polypeptide to the cell surface or secrete the polypeptide into the
media, and this signal peptide is clipped off by the host cell
before the protein leaves the cell. Signal sequences can be found
associated with a variety of proteins native to prokaryotes and
eukaryotes.

The term "oligonucleotide", as used herein in referring to 
the probe of the present invention, is defined as a molecule comprised 
of two or more ribonucleotides, preferably more than three. Its exact 
size will depend upon many factors which, in turn, depend upon the 
ultimate function and use of the oligonucleotide. 

The term "primer" as used herein refers to an 
oligonucleotide, whether occurring naturally as in a purified 
restriction digest or produced synthetically, which is capable of acting 
as a point of initiation of synthesis when placed under conditions in 
which synthesis of a primer extension product, which is 
complementary to a nucleic acid strand, is induced, i.e., in the 
presence of nucleotides and an inducing agent such as a DNA 
polymerase and at a suitable temperature and pH. The primer may 
be either single-stranded or double-stranded and must be sufficiently 
long to prime the synthesis of the desired extension product in the 
presence of the inducing agent. The exact length of the primer will 
depend upon many factors, including temperature, source of primer 
and use of the method. For example, for diagnostic applications,
depending on the complexity of the target sequence, the 
oligonucleotide primer typically contains 15-25 or more nucleotides, 
although it may contain fewer nucleotides. 

The primers herein are selected to be "substantially"
complementary to different strands of a particular target DNA
sequence. This means that the primers must be sufficiently
complementary to hybridize with their respective strands.
Therefore, the primer sequence need not reflect the exact sequence of
the template. For example, a non-complementary nucleotide
fragment may be attached to the 5' end of the primer, with the
remainder of the primer sequence being complementary to the
strand. Alternatively, non-complementary bases or longer sequences
can be interspersed into the primer, provided that the primer
sequence has sufficient complementarity with the sequence to
hybridize therewith and thereby form the template for the synthesis
of the extension product.
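To make the idea of a "substantially" complementary primer concrete, the sketch below scores what fraction of primer positions pair with a target site. The function names, the equal-length restriction, and the toy sequences are assumptions for illustration; the text itself sets no numerical cutoff, and real primer design also weighs melting temperature and mismatch position.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementarity(primer: str, template_site: str) -> float:
    """Fraction of primer positions that base-pair with the template site.

    Both sequences are written 5' to 3' and must have equal length, so the
    primer is compared with the reverse complement of the site.
    """
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must have equal length")
    target = reverse_complement(template_site)
    matches = sum(p == t for p, t in zip(primer.upper(), target))
    return matches / len(primer)


site = "AAGCTTGG"                          # hypothetical template site
print(complementarity("CCAAGCTT", site))   # fully complementary primer
print(complementarity("GCAAGCTT", site))   # one mismatch at the 5' end
```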

As used herein, the terms "restriction endonucleases" and
"restriction enzymes" refer to enzymes, each of which cuts double-
stranded DNA at or near a specific nucleotide sequence.

A cell has been "transformed" by exogenous or 
heterologous DNA when such DNA has been introduced inside the cell. 
The transforming DNA may or may not be integrated (covalently 
linked) into the genome of the cell. In prokaryotes, yeast, and 
mammalian cells for example, the transforming DNA may be
maintained on an episomal element such as a plasmid. With respect 
to eukaryotic cells, a stably transformed cell is one in which the 
transforming DNA has become integrated into a chromosome so that it 
is inherited by daughter cells through chromosome replication. This 
stability is demonstrated by the ability of the eukaryotic cell to 
establish cell lines or clones comprised of a population of daughter 
cells containing the transforming DNA. A "clone" is a population of 
cells derived from a single cell or ancestor by mitosis. A "cell line" is 
a clone of a primary cell that is capable of stable growth in vitro for
many generations.

Two DNA sequences are "substantially homologous"
when at least about 75% (preferably at least about 80%, and most
preferably at least about 90% or 95%) of the nucleotides match over
the defined length of the DNA sequences. Sequences that are
substantially homologous can be identified by comparing the
sequences using standard software available in sequence data banks,
or in a Southern hybridization experiment under, for example,
stringent conditions as defined for that particular system. Defining
appropriate hybridization conditions is within the skill of the art. See,
e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II, supra; Nucleic
Acid Hybridization, supra.

A "heterologous" region of the DNA construct is an
identifiable segment of DNA within a larger DNA molecule that is not
found in association with the larger molecule in nature. Thus, when
the heterologous region encodes a mammalian gene, the gene will
usually be flanked by DNA that does not flank the mammalian
genomic DNA in the genome of the source organism. In another
example, the coding sequence is a construct in which the coding sequence
itself is not found in nature (e.g., a cDNA where the genomic coding
sequence contains introns, or synthetic sequences having codons
different than the native gene). Allelic variations or naturally-
occurring mutational events do not give rise to a heterologous region
of DNA as defined herein.

The labels most commonly employed for these studies are
radioactive elements, enzymes, chemicals which fluoresce when
exposed to ultraviolet light, and others. A number of fluorescent
materials are known and can be utilized as labels. These include, for
example, fluorescein, rhodamine, auramine, Texas Red, AMCA blue
and Lucifer Yellow. A particular detecting material is anti-rabbit
antibody prepared in goats and conjugated with fluorescein through
an isothiocyanate.

Proteins can also be labeled with a radioactive element or
with an enzyme. The radioactive label can be detected by any of the
currently available counting procedures. The preferred isotope may
be selected from 3H, 14C, 32P, 35S, 36Cl, 51Cr, 57Co, 58Co, 59Fe, 90Y,
125I, 131I, and 186Re.

Enzyme labels are likewise useful, and can be detected by
any of the presently utilized colorimetric, spectrophotometric,
fluorospectrophotometric, amperometric or gasometric techniques.
The enzyme is conjugated to the selected particle by reaction with
bridging molecules such as carbodiimides, diisocyanates,
glutaraldehyde and the like. Many enzymes which can be used in
these procedures are known and can be utilized. The preferred are
peroxidase, β-glucuronidase, β-D-glucosidase, β-D-galactosidase,
urease, glucose oxidase plus peroxidase and alkaline phosphatase.
U.S. Patent Nos. 3,654,090, 3,850,752, and 4,016,043 are referred to
by way of example for their disclosure of alternate labeling material
and methods.

A particular assay system developed and utilized in the
art is known as a receptor assay. In a receptor assay, the material to
be assayed is appropriately labeled and then certain cellular test
colonies are inoculated with a quantity of both the labeled and
unlabeled material, after which binding studies are conducted to
determine the extent to which the labeled material binds to the cell
receptors. In this way, differences in affinity between materials can
be ascertained.

An assay useful in the art is known as a "cis/trans" assay. 
Briefly, this assay employs two genetic constructs, one of which 
is typically a plasmid that continually expresses a particular receptor
of interest when transfected into an appropriate cell line, and the 
second of which is a plasmid that expresses a reporter such as 
luciferase, under the control of a receptor/ligand complex. Thus, for 
example, if it is desired to evaluate a compound as a ligand for a 
particular receptor, one of the plasmids would be a construct that 
results in expression of the receptor in the chosen cell line, while the 
second plasmid would possess a promoter linked to the luciferase 
gene in which the response element to the particular receptor is 
inserted. If the compound under test is an agonist for the receptor, 
the ligand will complex with the receptor, and the resulting complex 
will bind the response element and initiate transcription of the 
luciferase gene. The resulting chemiluminescence is then measured 
photometrically, and dose response curves are obtained and 
compared to those of known ligands. The foregoing protocol is 
described in detail in U.S. Patent No. 4,981,784. 

As used herein, the term "host" is meant to include not 
only prokaryotes but also eukaryotes such as yeast, plant and animal 
cells. A recombinant DNA molecule or gene which encodes a human 
TADG-15 protein of the present invention can be used to transform a
host using any of the techniques commonly known to those of
ordinary skill in the art. Especially preferred is the use of a vector
containing coding sequences for the gene which encodes a human
TADG-15 protein of the present invention for purposes of prokaryote
transformation. Prokaryotic hosts may include E. coli, S.
typhimurium, Serratia marcescens and Bacillus subtilis. Eukaryotic
hosts include yeasts such as Pichia pastoris, mammalian cells and 
insect cells. 
In general, expression vectors containing promoter
sequences which facilitate the efficient transcription of the inserted
DNA fragment are used in connection with the host. The expression
vector typically contains an origin of replication, promoter(s),
terminator(s), as well as specific genes which are capable of providing
phenotypic selection in transformed cells. The transformed hosts can
be fermented and cultured according to means known in the art to
achieve optimal cell growth.

The invention includes a substantially pure DNA encoding
a TADG-15 protein, a strand of which DNA will hybridize at high
stringency to a probe containing a sequence of at least 15
consecutive nucleotides of (SEQ ID NO:1). The protein encoded by the
DNA of this invention may share at least 80% sequence identity
(preferably 85%, more preferably 90%, and most preferably 95%)
with the amino acids listed in Figure 10 (SEQ ID NO:2). More
preferably, the DNA includes the coding sequence of the nucleotides
of Figure 9 (SEQ ID NO:1), or a degenerate variant of such a sequence.

The probe to which the DNA of the invention hybridizes
preferably consists of a sequence of at least 20 consecutive
nucleotides, more preferably 40 nucleotides, even more preferably
50 nucleotides, and most preferably 100 nucleotides or more (up to
100%) of the coding sequence of the nucleotides listed in Figure 9
(SEQ ID NO: 1) or the complement thereof. Such a probe is useful for
detecting expression of TADG-15 in a human cell by a method
including the steps of (a) contacting mRNA obtained from the cell
with the labeled hybridization probe; and (b) detecting hybridization
of the probe with the mRNA.
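The length and origin requirements for such a probe can be checked mechanically. The sketch below tests whether a candidate probe is a run of at least a given number of consecutive nucleotides taken from a reference sequence or from its complement (written 5' to 3', i.e. the reverse complement). The function names and the short reference string are hypothetical; the reference is a toy sequence and not SEQ ID NO: 1.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_valid_probe(probe: str, reference: str, min_length: int = 20) -> bool:
    """True if the probe has at least `min_length` nucleotides and occurs as
    a consecutive run in the reference or in its reverse complement."""
    probe = probe.upper()
    reference = reference.upper()
    if len(probe) < min_length:
        return False
    return probe in reference or probe in reverse_complement(reference)


# Toy 33-nucleotide reference used only to exercise the check.
reference = "ATGGCCAAGCTTGGATCCGAATTCCTGCAGTAA"
sense_probe = reference[5:25]
antisense_probe = reverse_complement(reference)[0:20]

print(is_valid_probe(sense_probe, reference))
print(is_valid_probe(antisense_probe, reference))
print(is_valid_probe(reference[0:10], reference))
```

The default of 20 nucleotides mirrors the first preferred length in the text; the other preferred lengths can be passed as `min_length`.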

This invention also includes a substantially pure
DNA containing a sequence of at least 15 consecutive nucleotides
(preferably 20, more preferably 30, even more preferably 50, and
most preferably all) of the region from nucleotides 1 to 3147 of the
nucleotides listed in Figure 9 (SEQ ID NO:1).
By "high stringency" is meant DNA hybridization and
wash conditions characterized by high temperature and low salt
concentration, e.g., wash conditions of 65°C at a salt concentration of
approximately 0.1 x SSC, or the functional equivalent thereof. For
example, high stringency conditions may include hybridization at
about 42°C in the presence of about 50% formamide; a first wash at
about 65°C with about 2 x SSC containing 1% SDS; followed by a
second wash at about 65°C with about 0.1 x SSC.
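To show what the "low salt" wash conditions mean in absolute terms, the sketch below converts an SSC strength into solute concentrations. It assumes the commonly used composition of 1 x SSC (150 mM sodium chloride and 15 mM sodium citrate), which is not stated in the text; the names are illustrative.

```python
from fractions import Fraction

# Assumed composition of 1 x SSC in millimolar (standard laboratory recipe,
# brought in for illustration; not given in the text above).
SSC_1X_MM = {"sodium chloride": 150, "sodium citrate": 15}


def ssc_concentrations_mm(strength: str) -> dict:
    """Return solute concentrations in mM for a given SSC strength.

    The strength is passed as a string (e.g. "0.1") and parsed with
    Fraction so that the scaling is exact.
    """
    factor = Fraction(strength)
    return {solute: float(factor * mm) for solute, mm in SSC_1X_MM.items()}


first_wash = ssc_concentrations_mm("2")     # about 2 x SSC
second_wash = ssc_concentrations_mm("0.1")  # about 0.1 x SSC

print(first_wash)
print(second_wash)
```

Under this assumption the second wash carries one twentieth of the salt of the first, which is what makes it the more stringent step.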

By "substantially pure DNA" is meant DNA that is not part
of a milieu in which the DNA naturally occurs, by virtue of separation
(partial or total purification) of some or all of the molecules of that
milieu, or by virtue of alteration of sequences that flank the claimed
DNA. The term therefore includes, for example, a recombinant DNA
which is incorporated into a vector, into an autonomously replicating
plasmid or virus, or into the genomic DNA of a prokaryote or
eukaryote; or which exists as a separate molecule (e.g., a cDNA or a
genomic or cDNA fragment produced by polymerase chain reaction
(PCR) or restriction endonuclease digestion) independent of other
sequences. It also includes a recombinant DNA which is part of a
hybrid gene encoding additional polypeptide sequence, e.g., a fusion
protein. Also included is a recombinant DNA which includes a
portion of the nucleotides listed in Figure 9 (SEQ ID NO:1) which
encodes an alternative splice variant of TADG-15.

The DNA may have at least about 70% sequence
identity to the coding sequence of the nucleotides listed in Figure 9
(SEQ ID NO:1), preferably at least 75% (e.g. at least 80%); and most
preferably at least 90%. The identity between two sequences is a
direct function of the number of matching or identical positions.
When a subunit position in both of the two sequences is occupied by
the same monomeric subunit, e.g., if a given position is occupied by
an adenine in each of two DNA molecules, then they are identical at
that position. For example, if 7 positions in a sequence
10 nucleotides in length are identical to the corresponding positions
in a second 10-nucleotide sequence, then the two sequences have
70% sequence identity. The length of comparison sequences will
generally be at least 50 nucleotides, preferably at least 60
nucleotides, more preferably at least 75 nucleotides, and most
preferably 100 nucleotides. Sequence identity is typically measured
using sequence analysis software (e.g., Sequence Analysis Software
Package of the Genetics Computer Group, University of Wisconsin
Biotechnology Center, 1710 University Avenue, Madison, WI 53705).
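The position-by-position definition of identity given above can be sketched directly. This minimal version assumes two already-aligned sequences of equal length and does not perform the alignment that sequence analysis software would carry out; the function name and the two toy sequences are illustrative.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions occupied by the same subunit in two
    pre-aligned sequences of equal, non-zero length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Two hypothetical 10-nucleotide sequences identical at 7 positions,
# mirroring the worked example in the text.
first = "ACGTACGTAC"
second = "ACGTACGGGG"
print(percent_identity(first, second))
```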

The present invention comprises a vector comprising a
DNA sequence which encodes a human TADG-15 protein and said
vector is capable of replication in a host which comprises, in operable
linkage: a) an origin of replication; b) a promoter; and c) a DNA
sequence coding for said protein. Preferably, the vector of the
present invention contains a portion of the DNA sequence shown in
SEQ ID No:1. A "vector" may be defined as a replicable nucleic acid
construct, e.g., a plasmid or viral nucleic acid. Vectors may be used
to amplify and/or express nucleic acid encoding TADG-15 protein.
An expression vector is a replicable construct in which a nucleic acid
sequence encoding a polypeptide is operably linked to suitable
control sequences capable of effecting expression of the polypeptide
in a cell. The need for such control sequences will vary depending
upon the cell selected and the transformation method chosen.
Generally, control sequences include a transcriptional promoter 
and/or enhancer, suitable mRNA ribosomal binding sites, and 
sequences which control the termination of transcription and 
translation. Methods which are well known to those skilled in the art 
can be used to construct expression vectors containing appropriate 
transcriptional and translational control signals. See for example, the 
techniques described in Sambrook et al., 1989, Molecular Cloning: A 
Laboratory Manual (2nd Ed.), Cold Spring Harbor Press, N.Y. A gene 
and its transcription control sequences are defined as being
"operably linked" if the transcription control sequences effectively 
control the transcription of the gene. Vectors of the invention 
include, but are not limited to, plasmid vectors and viral vectors. 
Preferred viral vectors of the invention are those derived from
retroviruses, adenovirus, adeno-associated virus, SV40 virus, or
herpes viruses.

By a "substantially pure protein" is meant a protein
which has been separated from at least some of those components
which naturally accompany it. Typically, the protein is substantially
pure when it is at least 60%, by weight, free from the proteins and
other naturally-occurring organic molecules with which it is
naturally associated in vivo. Preferably, the purity of the
preparation is at least 75%, more preferably at least 90%, and most
preferably at least 99%, by weight. A substantially pure TADG-15
protein may be obtained, for example, by extraction from a natural
source; by expression of a recombinant nucleic acid
encoding a TADG-15 polypeptide; or by chemically synthesizing the
protein. Purity can be measured by any appropriate method, e.g.,
column chromatography such as immunoaffinity chromatography
using an antibody specific for TADG-15, polyacrylamide gel
electrophoresis, or HPLC analysis. A protein is substantially free of
naturally associated components when it is separated from at least
some of those contaminants which accompany it in its natural state.
Thus, a protein which is chemically synthesized or produced in a
cellular system different from the cell from which it naturally
originates will be, by definition, substantially free from its naturally
associated components. Accordingly, substantially pure proteins
include eukaryotic proteins synthesized in E. coli, other prokaryotes,
or any other organism in which they do not naturally occur.

In addition to substantially full-length proteins, the
invention also includes fragments (e.g., antigenic fragments) of the
TADG-15 protein (SEQ ID No:2). As used herein, "fragment," as
applied to a polypeptide, will ordinarily be at least 10 residues,
more typically at least 20 residues, and preferably at least 30 (e.g.,
50) residues in length, but less than the entire, intact sequence.
Fragments of the TADG-15 protein can be generated by methods
known to those skilled in the art, e.g., by enzymatic digestion of
naturally occurring or recombinant TADG-15 protein, by recombinant
DNA techniques using an expression vector that encodes a defined
fragment of TADG-15, or by chemical synthesis. The ability of a
candidate fragment to exhibit a characteristic of TADG-15 (e.g.,
binding to an antibody specific for TADG-15) can be assessed by
methods described herein. Purified TADG-15 or antigenic fragments
of TADG-15 can be used to generate new antibodies or to
test existing antibodies (e.g., as positive controls in a diagnostic
assay) by employing standard protocols known to those skilled in the
art. Included in this invention are polyclonal antisera generated by
using TADG-15 or a fragment of TADG-15 as the immunogen in, e.g.,
rabbits. Standard protocols for monoclonal and polyclonal antibody
production known to those skilled in this art are employed. The
monoclonal antibodies generated by this procedure can be screened
for the ability to identify recombinant TADG-15 cDNA clones, and to
distinguish them from known cDNA clones.
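
The length criteria given above for a fragment can be written as a simple check. The following is a minimal sketch only: the helper name `classify_fragment` and the tier labels are hypothetical, and the full length of 855 residues is the figure stated elsewhere in this specification for the open reading frame of SEQ ID No:2.

```python
FULL_LENGTH = 855  # residues in the TADG-15 open reading frame (SEQ ID No:2)


def classify_fragment(n_residues, full_length=FULL_LENGTH):
    """Return the length tier of a candidate fragment (hypothetical helper)."""
    if n_residues >= full_length:
        return "not a fragment"          # must be less than the intact sequence
    if n_residues >= 30:
        return "preferred"               # at least 30 residues (e.g., 50)
    if n_residues >= 20:
        return "more typical"            # at least 20 residues
    if n_residues >= 10:
        return "ordinary"                # at least 10 residues
    return "shorter than ordinary"       # below the ordinary minimum
```

A 50-residue peptide would fall in the "preferred" tier, whereas the intact 855-residue protein is by definition not a fragment.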

Further included in this invention are TADG-15 proteins
which are encoded at least in part by portions of SEQ ID NO:2, e.g.,
products of alternative mRNA splicing or alternative protein
processing events, or in which a section of TADG-15 sequence has
been deleted. The fragment, or the intact TADG-15 polypeptide, may
be covalently linked to another polypeptide, e.g., one which acts as a
label, a ligand or a means to increase antigenicity.

The invention also includes a polyclonal or monoclonal
antibody which specifically binds to TADG-15. The invention
encompasses not only an intact monoclonal antibody, but also an
immunologically-active antibody fragment, e.g., a Fab or (Fab)2
fragment; an engineered single chain Fv molecule; or a chimeric
molecule, e.g., an antibody which contains the binding specificity of
one antibody, e.g., of murine origin, and the remaining portions of
another antibody, e.g., of human origin.

In one embodiment, the antibody, or a fragment thereof,
may be linked to a toxin or to a detectable label, e.g. a radioactive
label, non-radioactive isotopic label, fluorescent label,
chemiluminescent label, paramagnetic label, enzyme
label, or colorimetric label. Examples of suitable toxins include
diphtheria toxin, Pseudomonas exotoxin A, ricin, and cholera toxin.
Examples of suitable enzyme labels include malate hydrolase,
staphylococcal nuclease, delta-5-steroid isomerase, alcohol
dehydrogenase, alpha-glycerol phosphate dehydrogenase, triose
phosphate isomerase, peroxidase, alkaline phosphatase, asparaginase,
glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase,
glucose-6-phosphate dehydrogenase, glucoamylase,
acetylcholinesterase, etc. Examples of suitable radioisotopic labels
include 3H, 125I, 131I, 32P, 35S, 14C, etc.

Paramagnetic isotopes for purposes of in vivo diagnosis
can also be used according to the methods of this invention. There are
numerous examples of elements that are useful in magnetic
resonance imaging. For discussions on in vivo nuclear magnetic
resonance imaging, see, for example, Schaefer et al., (1989) JACC 14,
472-480; Shreve et al., (1986) Magn. Reson. Med. 3, 336-340; Wolf, G.
L., (1984) Physiol. Chem. Phys. Med. NMR 16, 93-95; Wesbey et al.,
(1984) Physiol. Chem. Phys. Med. NMR 16, 145-155; Runge et al.,
(1984) Invest. Radiol. 19, 408-415. Examples of suitable fluorescent
labels include a fluorescein label, an isothiocyanate label, a
rhodamine label, a phycoerythrin label, a phycocyanin label, an
allophycocyanin label, an o-phthaldehyde label, a fluorescamine label,
etc. Examples of chemiluminescent labels include a luminol label, an
isoluminol label, an aromatic acridinium ester label, an imidazole
label, an acridinium salt label, an oxalate ester label, a luciferin label,
a luciferase label, an aequorin label, etc.

Those of ordinary skill in the art will know of other
suitable labels which may be employed in accordance with
the present invention. The binding of these labels to antibodies or
fragments thereof can be accomplished using standard techniques
commonly known to those of ordinary skill in the art. Typical
techniques are described by Kennedy et al., (1976) Clin. Chim. Acta
70, 1-31; and Schurs et al., (1977) Clin. Chim. Acta 81, 1-40. Coupling
techniques mentioned in the latter are the glutaraldehyde method,
the periodate method, the dimaleimide method, and the
m-maleimidobenzyl-N-hydroxy-succinimide ester method. All of these
methods are incorporated by reference herein.

Also within the invention is a method of detecting TADG-15
protein in a biological sample, which includes the steps of
contacting the sample with the labeled antibody, e.g., radioactively 
tagged antibody specific for TADG-15, and determining whether the 
antibody binds to a component of the sample. 

As described herein, the invention provides a number of 
diagnostic advantages and uses. For example, the TADG-15 protein is 
useful in diagnosing cancer in different tissues since this protein is 
highly overexpressed in tumor cells. Antibodies (or antigen-binding 
fragments thereof) which bind to an epitope specific for TADG-15, 
are useful in a method of detecting TADG-15 protein in a biological 
sample for diagnosis of cancerous or neoplastic transformation. This 
method includes the steps of obtaining a biological sample (e.g., cells, 
blood, plasma, tissue, etc.) from a patient suspected of having cancer, 
contacting the sample with a labeled antibody (e.g., radioactively 
tagged antibody) specific for TADG-15, and detecting the TADG-15 
protein using standard immunoassay techniques such as an ELISA. 
Antibody binding to the biological sample indicates that the sample 
contains a component which specifically binds to an epitope 
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within TADG-15. 

Likewise, a standard Northern blot assay can be used to
ascertain the relative amounts of TADG-15 mRNA in a cell or tissue
obtained from a patient suspected of having cancer, in accordance
with conventional Northern hybridization techniques known to those
of ordinary skill in the art. This Northern assay uses a hybridization
probe, e.g. radiolabelled TADG-15 cDNA, either containing the
full-length, single stranded DNA having a sequence complementary to
SEQ ID NO:1 (Figure 9), or a fragment of that DNA sequence at least
20 (preferably at least 30, more preferably at least 50, and most
preferably at least 100) consecutive nucleotides in length. The DNA
hybridization probe can be labeled by any of the many different
methods known to those skilled in this art.
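
The two probe requirements described above (complementarity to the target sequence and a minimum length of 20 consecutive nucleotides) can be illustrated with a short check. This is a sketch under stated assumptions: `TOY_TARGET` is an invented sequence used only for illustration and is not SEQ ID NO:1, and the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Invented 36-nucleotide stand-in for a target cDNA; NOT SEQ ID NO:1.
TOY_TARGET = "ATGGCCAAGGTTCTGACCGATTACGGCAAGTCCTGA"


def reverse_complement(dna):
    """Return the strand complementary to `dna`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def is_valid_probe(probe, target, min_len=20):
    """True if `probe` is at least `min_len` nt long and exactly
    complementary to a contiguous stretch of `target`."""
    return len(probe) >= min_len and reverse_complement(probe) in target.upper()


long_probe = reverse_complement(TOY_TARGET[5:30])   # 25 nt, complementary
short_probe = reverse_complement(TOY_TARGET[:10])   # 10 nt, too short
```

Exact complementarity is a simplification; actual hybridization tolerates some mismatch depending on stringency, which this sketch does not model.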

Antibodies to the TADG-15 protein can be used in an
immunoassay to detect increased levels of TADG-15 protein
expression in tissues suspected of neoplastic transformation. These
same uses can be achieved with Northern blot assays and analyses.

The present invention is directed to DNA encoding a
TADG-15 protein selected from the group consisting of: (a) isolated
DNA which encodes a TADG-15 protein; (b) isolated DNA which
hybridizes to isolated DNA of (a) above and which encodes a TADG-15
protein; and (c) isolated DNA differing from the isolated DNAs of (a)
and (b) above in codon sequence due to the degeneracy of the genetic
code, and which encodes a TADG-15 protein. Preferably, the DNA has
the sequence shown in SEQ ID No:1. More preferably, the DNA
encodes a TADG-15 protein having the amino acid sequence shown in
SEQ ID No:2.
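
Item (c) above relies on the degeneracy of the genetic code, i.e., different codon sequences can encode the same protein. The following minimal sketch illustrates the point using the standard genetic code; the two short sequences are invented for illustration and are not taken from SEQ ID No:1.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate `dna` in frame 0 until a stop codon or the end of the sequence."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two invented sequences that differ at three codons but encode the same peptide.
seq_a = "ATGGGCTCCGAC"
seq_b = "ATGGGTAGTGAT"
```

Both `translate(seq_a)` and `translate(seq_b)` give the peptide `MGSD`, although the nucleotide sequences differ.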

The present invention is also directed to a
vector capable of expressing the DNA of the present invention
adapted for expression in a recombinant cell and regulatory elements
necessary for expression of the DNA in the cell. Preferably, the vector
contains DNA encoding a TADG-15 protein having the amino acid
sequence shown in SEQ ID No:2.

The present invention is also directed to a host cell
transfected with the vector described herein, said vector expressing a
TADG-15 protein. Representative host cells include
bacterial cells, mammalian cells and insect cells.

The present invention is also directed to an isolated and
purified TADG-15 protein coded for by DNA selected from the group
consisting of: (a) isolated DNA which encodes a TADG-15 protein; (b)
isolated DNA which hybridizes to isolated DNA of (a) above and which
encodes a TADG-15 protein; and (c) isolated DNA differing from the
isolated DNAs of (a) and (b) above in codon sequence due to the
degeneracy of the genetic code, and which encodes a TADG-15
protein. Preferably, the isolated and purified TADG-15 protein
has the amino acid sequence shown in SEQ ID No:2.

The present invention is also directed to a method of
detecting expression of the TADG-15 protein, comprising the steps
of: (a) contacting mRNA obtained from a cell with a labeled
hybridization probe; and (b) detecting hybridization of the probe
with the mRNA.

The following examples are given for the purpose of
illustrating various embodiments of the invention and are not meant
to limit the present invention in any fashion.

Tissue collection and storage

Upon patient hysterectomy, bilateral salpingo-oophorectomy,
or surgical removal of neoplastic tissue, the specimen
was retrieved and placed on ice. The specimen was then taken to the
resident pathologist for isolation and identification of specific tissue
samples. Finally, the sample was frozen in liquid nitrogen, logged into
the laboratory record and stored at -80°C. Additional specimens were
frequently obtained from the Cooperative Human Tissue Network
(CHTN). These samples were prepared by the CHTN and shipped to us
on dry ice. Upon arrival, these specimens were logged into the
laboratory record and stored at -80°C.

mRNA isolation and cDNA synthesis

Forty-one ovarian tumors (10 low malignant potential
tumors and 31 carcinomas) and 10 normal ovaries were obtained
from surgical specimens and frozen in liquid nitrogen. The human
ovarian carcinoma cell lines SW 626 and Caov 3, the human breast
carcinoma cell lines MDA-MB-231 and MDA-MB-435S, and the human
uterine cervical carcinoma cell line HeLa were purchased from the
American Type Culture Collection (Rockville, MD). Cells were cultured
to subconfluency in Dulbecco's modified Eagle's medium,
supplemented with 10% (v/v) fetal bovine serum and antibiotics.

Messenger RNA (mRNA) isolation was performed
according to the manufacturer's instructions using the Mini RiboSep
Ultra mRNA isolation kit purchased from Becton Dickinson (cat. #
30034). This was an oligo(dT) chromatography based system of
mRNA isolation. The amount of mRNA recovered was quantitated by
UV spectrophotometry.

First strand complementary DNA (cDNA) was synthesized
using 5.0 µg of mRNA and either random hexamer or oligo(dT)
primers according to the manufacturer's protocol utilizing a first
strand synthesis kit obtained from Clontech (cat.# K1402-1). The
purity of the cDNA was evaluated by PCR using primers specific for
the p53 gene. These primers span an intron such that pure cDNA can
be distinguished from cDNA that is contaminated with genomic DNA.

EXAMPLE 3

PCR reactions

The mRNA overexpression of TADG-15 was determined
using a quantitative PCR. Oligonucleotide primers were used for:
TADG-15, forward 5'-ATGACAGAGGATTCAGGTAC-3' (SEQ ID NO: 10)
and reverse 5'-GAAGGTGAAGTCATTGAAGA-3' (SEQ ID NO: 11); and
β-tubulin, forward 5'-TGCATTGACAACGAGGC-3' (SEQ ID NO: 12) and
reverse 5'-CTGTCTTGACATTGTTG-3' (SEQ ID NO: 13). β-tubulin was
utilized as an internal control. Reactions were carried out as follows:
first strand cDNA generated from 50 ng of mRNA was used as
template in the presence of 1.0 mM MgCl2, 0.2 mM dNTPs, 0.025 U
Taq polymerase/µl of reaction, and 1x buffer supplied with
enzyme. In addition, primers must be added to the PCR reaction.
Degenerate primers which may amplify a variety of cDNAs were used
at a final concentration of 2.0 µM each, whereas primers which
amplify specific cDNAs were added to a final concentration of 0.2 µM
each.
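
Basic properties of the primers listed above, such as length and GC content, can be tabulated with a few lines of code. This is a minimal sketch: the sequences are transcribed from the text above with spacing removed, and the helper names `gc_fraction` and `summarize` are hypothetical.

```python
# Primer sequences as listed in this example (SEQ ID NOs: 10-13).
PRIMERS = {
    "TADG-15 forward": "ATGACAGAGGATTCAGGTAC",
    "TADG-15 reverse": "GAAGGTGAAGTCATTGAAGA",
    "beta-tubulin forward": "TGCATTGACAACGAGGC",
    "beta-tubulin reverse": "CTGTCTTGACATTGTTG",
}


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def summarize(primers):
    """Map each primer name to (length in nt, GC fraction rounded to 2 places)."""
    return {
        name: (len(seq), round(gc_fraction(seq), 2))
        for name, seq in primers.items()
    }
```

As transcribed, the TADG-15 primers are 20 nucleotides long and the β-tubulin primers are 17 nucleotides long.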

After initial denaturation at 95°C for 3 minutes, thirty
cycles of PCR were carried out in a Perkin Elmer GeneAmp 2400
thermal cycler. Each cycle consisted of 30 seconds of denaturation at
95°C, 30 seconds of primer annealing at the appropriate annealing
temperature, and 30 seconds of extension at 72°C. The final cycle was
extended at 72°C for 7 minutes. To ensure that the reaction
succeeded, a fraction of the mixture was electrophoresed through a
2% agarose/TAE gel stained with ethidium bromide (final
concentration 1 µg/ml). The annealing temperature varies according
to the primers that are used in the PCR reaction. For the reactions
involving degenerate primers, an annealing temperature of 48°C was
used. The appropriate annealing temperature for the TADG-15 and
β-tubulin specific primers is 62°C.
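
The cycling program described above can be written down as data, which makes the total programmed hold time easy to compute. This is a sketch only; it assumes that the 7-minute extension is an additional hold after the last cycle and it ignores temperature ramp times, which depend on the instrument. The names used are hypothetical.

```python
INITIAL_DENATURATION_S = 3 * 60          # 95 C for 3 minutes
CYCLES = 30
STEP_SECONDS = {                         # per-cycle holds
    "denaturation": 30,                  # 95 C
    "annealing": 30,                     # primer-dependent temperature
    "extension": 30,                     # 72 C
}
FINAL_EXTENSION_S = 7 * 60               # 72 C for 7 minutes (assumed additional)

ANNEALING_TEMP_C = {"degenerate": 48, "specific": 62}


def total_hold_time_s():
    """Sum of programmed hold times in seconds, excluding ramp times."""
    per_cycle = sum(STEP_SECONDS.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S
```

Under these assumptions the programmed holds total 3300 seconds (55 minutes).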

EXAMPLE 4 

T-vector ligation and transformations 

The purified PCR products were ligated into the Promega
T-vector plasmid and the ligation products were used to transform JM109
competent cells according to the manufacturer's instructions (Promega
cat. #A3610). Positive colonies were cultured for amplification, the
plasmid DNA was isolated by means of the Wizard Minipreps DNA
purification system (Promega cat #A7500), and the plasmids were
digested with ApaI and SacI restriction enzymes to determine the size
of the insert. Plasmids with inserts of the size(s) visualized by the
previously described PCR product gel electrophoresis were sequenced.

EXAMPLE 5

DNA sequencing

Utilizing a plasmid specific primer near the cloning site,
sequencing reactions were carried out using PRISM Ready Reaction
Dye Deoxy™ terminators (Applied Biosystems cat# 401384) according
to the manufacturer's instructions. Residual dye terminators were
removed from the completed sequencing reaction using a
Centri-sep™ spin column (Princeton Separation cat.# CS-901). An Applied
Biosystems Model 373A DNA Sequencing System was available and
was used for sequence analysis. Based upon the determined
sequence, primers that specifically amplify the gene of interest were
designed and synthesized.

EXAMPLE 6

Northern blot analysis

mRNAs (10 µg) were size separated by electrophoresis
through a 1% formaldehyde-agarose gel in 0.02 M MOPS, 0.05 M
sodium acetate (pH 7.0), and 0.001 M EDTA. The mRNAs were then
blotted to Hybond-N (Amersham) by capillary action in 20x SSPE.
The RNAs were fixed to the membrane by baking for 2 hours at 80°C.
Additional multiple tissue northern (MTN) blots were purchased from
CLONTECH Laboratories, Inc. These blots include the Human
MTN blot (cat.#7760-1), the Human MTN II blot (cat.#7759-1), the
Human Fetal MTN II blot (cat.#7756-1), and the Human Brain MTN
III blot (cat.#7750-1). The appropriate probes were radiolabeled
utilizing the Prime-a-Gene Labeling System available from Promega
(cat#U1100). The blots were probed and stripped according to the
ExpressHyb Hybridization Solution protocol available from CLONTECH
(cat.#8015-1 or 8015-2).
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Quantitative PCR 

Quantitative PCR was performed in a reaction mixture
consisting of cDNA derived from 50 ng of mRNA, 5 pmol of sense and
antisense primers for TADG-15 and the internal control β-tubulin, 0.2
mmol of dNTPs, 0.5 µCi of [α-32P]dCTP, and 0.625 U of Taq
polymerase in 1x buffer in a final volume of 25 µl. This mixture was
subjected to 1 minute of denaturation at 95°C followed by 30 cycles of
denaturation for 30 seconds at 95°C, 30 seconds of annealing at 62°C,
and 1 minute of extension at 72°C with an additional 7 minutes of
extension on the last cycle. The product was electrophoresed through
a 2% agarose gel for separation, the gel was dried under vacuum and
autoradiographed. The relative radioactivity of each band was
determined by PhosphorImager from Molecular Dynamics.
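
One common way to use an internal control such as β-tubulin is to express the target band signal as a ratio to the control band signal from the same reaction. The specification does not state the exact calculation used, so the following is an illustrative sketch under that assumption; the function names and all numeric values in any usage are hypothetical.

```python
def relative_expression(target_signal, control_signal):
    """Ratio of target band signal to internal-control band signal."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return target_signal / control_signal


def fold_over_normal(tumor_ratio, normal_ratios):
    """Tumor expression ratio divided by the mean ratio of normal samples."""
    if not normal_ratios:
        raise ValueError("at least one normal sample is required")
    mean_normal = sum(normal_ratios) / len(normal_ratios)
    return tumor_ratio / mean_normal
```

For example, hypothetical band signals of 500 (TADG-15) and 1000 (β-tubulin) give a ratio of 0.5; compared with normal samples averaging 0.25, this would correspond to a two-fold difference.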
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The present invention describes the use of primers 
directed to conserved areas of the serine protease family to 
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identify members of that family which are overexpressed in 
carcinoma. Several genes were identified and cloned in other tissues,
but not previously associated with ovarian carcinoma. The present
invention describes a protease identified in ovarian carcinoma. This
gene was identified using primers to the conserved area surrounding
the catalytic domain of the conserved amino acid histidine and the
downstream conserved amino acid serine which lies approximately
150 amino acids towards the carboxyl end of the protease.
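
Because each amino acid is encoded by three nucleotides, the spacing of approximately 150 amino acids between the conserved histidine and serine implies a PCR product of roughly 450 base pairs. The following sketch shows the arithmetic; it is an approximation that ignores the exact primer positions, and the function name is hypothetical.

```python
NT_PER_CODON = 3


def expected_amplicon_bp(residues_between_motifs):
    """Approximate amplicon length in bp spanning the given number of residues."""
    return residues_between_motifs * NT_PER_CODON


approx_size = expected_amplicon_bp(150)  # roughly 450 bp
```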

The gene encoding the novel extracellular serine protease 
of the present invention was identified from a group of proteases 
overexpressed in carcinoma by subcloning and sequencing the 
appropriate PCR products. An example of such a PCR reaction is given 
in Figure 1. Subcloning and sequencing of individual bands from such 
an amplification provided a basis for identifying the protease of the 
present invention. 

EXAMPLE 9 

The sequence determined for the catalytic domain of 
TADG-15 is presented in Figure 2 and is consistent with other serine 
proteases and specifically contains conserved amino acids appropriate 
for the catalytic domain of the trypsin-like serine protease family. 
Specific primers (20mers) derived from this sequence were used. 

A series of normal and tumor cDNAs were examined to
determine the expression of the TADG-15 gene in ovarian carcinoma.
In a series of normal derived cDNA compared to carcinoma derived
cDNA using β-tubulin as an internal control for PCR amplification,
TADG-15 was significantly overexpressed in all of the
carcinomas examined and either was not detected or was detected at
a very low level in normal epithelial tissue (Figure 3). This evaluation
was extended to a standard panel of about 40 tumors. Using these
specific primers, the expression of this gene was also examined in
tumor cell lines derived from both ovarian and breast carcinoma
tissues as shown in Figure 5 and in other tumor tissues as shown in
Figure 6. The expression of TADG-15 was also observed in carcinomas
of the breast, colon, prostate and lung.

Using the specific sequence for TADG-15 covering the full
domain of the catalytic site as a probe for Northern blot analysis,
three Northern blots were examined: one derived from ovarian
tissues, both normal and carcinoma; one from fetal tissues; and one
from adult normal tissues. As shown in Figure 7, TADG-15 transcripts
were noted in all ovarian carcinomas, but were not present in
detectable levels in any of the following tissues: a) normal ovary, b)
fetal liver and brain, c) adult spleen, thymus, testes, ovary and
peripheral blood lymphocytes, d) skeletal muscle, liver, brain or
heart. The transcript size was found to be approximately 3.2 kb. The
hybridization for the fetal and adult blots was appropriate and done
with the same probe as with the ovarian tissue. Subsequent to this
examination, it was confirmed that these blots contained other
detectable mRNA transcripts.

Initially using the catalytic domain of the protease to
probe HeLa cDNA and ovarian tumor cDNA libraries, one clone was
obtained covering the entire 3' end of the TADG-15 gene from the
ovarian tumor library. On further screening using the 5' end of the
newly detected clones, two more clones were identified covering the
5' end of the TADG-15 gene from the HeLa library (Figure 8). The
complete nucleotide sequence (SEQ ID No:1) is provided in Figure 9
along with translation of the open reading frame (SEQ ID No:2).

In the nucleotide sequence, there is a Kozak sequence
typical of sequences upstream from the initiation site of translation.
There is also a poly-adenylation signal sequence and a
poly-adenylated tail. The open reading frame consists of an 855 amino acid
sequence (SEQ ID No:2) which includes an amino terminal cytoplasmic
tail from amino acids 1-50, an approximately 22 amino acid
transmembrane domain followed by an extracellular sequence
preceding two CUB repeats identified from complement
subcomponents C1r and C1s. These two repeats are followed by four
repeat domains of a class A motif of the LDL receptor and these four
repeats are followed by the protease enzyme of the trypsin family
constituting the carboxyl end of the TADG-15 protein (Figure 11).
Also a clear delineation of the catalytic domain conserved histidine,
aspartic acid, serine series along with a series of amino acids
conserved in the serine protease family is indicated (Figure 10).

A search of GenBank for similar previously identified
sequences yielded one such sequence with relatively high homology
to a portion of the TADG-15 gene. The similarity between the portion
of TADG-15 from nucleotide #182 to 3139 and SNC-19 (SEQ ID No: 9;
GenBank accession #U20428) is approximately 97% (Figure 12).
There are however significant differences between SNC-19 and
TADG-15, viz. TADG-15 has an open reading frame of 855 amino acids
whereas the longest ORF of SNC-19 is only 173 amino acids. SNC-19
does not include a proper start site for the initiation of translation nor
does it include the amino terminal portion of the protein encoded by
TADG-15. Moreover, SNC-19 does not include an ORF for a
functional serine protease because the His, Asp and Ser residues
necessary for function are encoded in different reading frames.
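
The effect of reading frame on the open reading frame can be illustrated by scanning the three forward frames of a sequence for the longest ATG-to-stop stretch. The following is a minimal sketch using the standard genetic code; the toy sequences are invented and are neither TADG-15 nor SNC-19, and only ORFs terminated by a stop codon are counted.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def longest_orf_per_frame(dna):
    """Return {frame: length in residues of the longest ATG-to-stop ORF}
    for the three forward reading frames (0, 1, 2)."""
    dna = dna.upper()
    result = {}
    for frame in range(3):
        best = current = 0
        in_orf = False
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if not in_orf and codon == "ATG":
                in_orf, current = True, 0      # start a new candidate ORF
            if in_orf:
                if CODON_TABLE[codon] == "*":  # stop codon closes the ORF
                    best = max(best, current)
                    in_orf = False
                else:
                    current += 1
        result[frame] = best
    return result


toy = "ATGGGCTCCGACTGA"            # invented: ATG GGC TCC GAC TGA
frameshifted = toy[:6] + toy[7:]   # same sequence with one base deleted
```

In this toy case the intact sequence has a four-residue ORF in frame 0, while the single-base deletion leaves no complete ORF in any forward frame, which is the kind of disruption that results when essential residues fall in different reading frames.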

TADG-15 is a highly overexpressed gene in tumors. It is
expressed in a limited number of normal tissues, primarily tissues
that are involved in either uptake or secretion of molecules, e.g. colon
and pancreas. TADG-15 is further novel in its component structure of
domains in that it has a protease catalytic domain which could be
released and used as a diagnostic and which has the potential for a
target for therapeutic intervention. TADG-15 also has ligand binding
domains which are commonly associated with molecules that
internalize or take up ligands from the external surface of the cell as
does the LDL receptor for the LDL cholesterol complex. There is
potential that these domains may be involved in uptake of specific
ligands and they may offer the potential for making delivery of toxic
molecules or genes to tumor cells which express this molecule on their
surface. It has features that are similar to the hepsin serine protease
molecule in that it also has an amino-terminal transmembrane
domain with the proteolytic catalytic domain extended into the
extracellular matrix. The difference here is that TADG-15 includes
these ligand binding repeat domains which the hepsin gene does not
have. In addition to the use of this gene as a diagnostic or therapeutic
target in ovarian carcinoma and other carcinomas including breast,
prostate, lung and colon, its ligand-binding domains may be valuable
in the uptake of specific molecules into tumor cells.

Any patents or publications mentioned in this
specification are indicative of the levels of those skilled in the art to
which the invention pertains. These patents and publications are
herein incorporated by reference to the same extent as
if each individual publication was specifically and individually
indicated to be incorporated by reference.

One skilled in the art will readily appreciate that the
present invention is well adapted to carry out the objects and obtain
the ends and advantages mentioned, as well as those inherent therein.
The present examples along with the methods, procedures,
treatments, molecules, and specific compounds described herein are
presently representative of preferred embodiments, are exemplary,
and are not intended as limitations on the scope of the invention.
Changes therein and other uses will occur to those skilled in the art
which are encompassed within the spirit of the invention as defined
by the scope of the claims.

WHAT IS CLAIMED IS:

1. DNA encoding a TADG-15 protein selected from the
group consisting of:

(a) isolated DNA which encodes a TADG-15 protein; 

(b) isolated DNA which hybridizes to isolated DNA of (a) 
above and which encodes a TADG-15 protein; and 

(c) isolated DNA differing from the isolated DNAs of (a) 
and (b) above in codon sequence due to the degeneracy of the genetic 
code, and which encodes a TADG-15 protein. 



2. The DNA of claim 1, wherein said DNA has the
sequence shown in SEQ ID No:1.

3. The DNA of claim 1, wherein said TADG-15 protein
has the amino acid sequence shown in SEQ ID No:2.

4. A vector capable of expressing the DNA of claim
1 adapted for expression in a recombinant cell and regulatory
elements necessary for expression of the DNA in the cell.

5. The vector of claim 4, wherein said DNA encodes a
TADG-15 protein having the amino acid sequence shown in SEQ ID
No:2.
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6. A host cell transfected with the vector of claim 4, 
said vector expressing a TADG-15 protein. 

7. The host cell of claim 6, wherein said cell is selected
from the group consisting of bacterial cells, mammalian cells, plant cells
and insect cells.

8. The host cell of claim 7, wherein said bacterial cell is
E. coli.

9. Isolated and purified TADG-15 protein coded for by 
DNA selected from the group consisting of: 

(a) isolated DNA which encodes a TADG-15 protein; 

(b) isolated DNA which hybridizes to isolated DNA of (a) 
above and which encodes a TADG-15 protein; and 

(c) isolated DNA differing from the isolated DNAs of (a) 
and (b) above in codon sequence due to the degeneracy of the genetic 
code, and which encodes a TADG-15 protein. 



10. The isolated and purified TADG-15 protein of claim 
9 having the amino acid sequence shown in SEQ ID No:2. 



WO 99/42120 PCT/US99/03436

11. A method of detecting expression of the protein of
claim 1, comprising the steps of:

(a) contacting mRNA obtained from the cell with the
labeled hybridization probe; and

(b) detecting hybridization of the probe with the mRNA.

[Figure: nucleotide sequence of the TADG-15 cDNA with translation of
the open reading frame (cf. SEQ ID No:1 and SEQ ID No:2). The
sequence text is not legible in this copy.]
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MGSDRARKGG GGPKDFGAGL KYNSRHEKVN GLEEGVEFLP VNNVKKVEKH 



GPGPWVVLAA VLIGLLLVLL GIGFLVWHLQ YRDVRVQKVF NGYMRITNEN 



FVDAYENSNS T^FVSLASKV KDALKLLYSG VPFLGPYHKE SAVTAFSEGS 



VIAYYWSEFS I PQHLVEEAE RVMAEERVVM LPPRARSLKS FVVTSVVAFP 
TDSKTVQRTQ DNS 



CSFGLHA RGVELMRFTT PGFPDSPYPA HARCQWALRG 



DADSVLSLTF RSFDLASCDE RGSDLVTVYN TLS PMEPHAL VQLCGTYPPS 

* 



Y tNLTf HSSQN VLLI TLITNT ERRHPGFEAT FFQLPRMSSC GGRLRKAQGT 



FNSPYYPGHY PPNIDCTWNI EVPNNQHVKV SFKFFYLLEP GVPAGTCPKD 
YVEINGEKYC GERSQFWTS NSNKITVRFH SDQS YTDTGF LAEYLSY 



DSS 



DPCPGQFTCR TGRCI RKELR CDGWADCTDH (SDEJLNCSCDA GHQFTCKNKF 



CKPLFWVCDS VNDCGDN 3DE QGCSCPAQTF RCSNGKCLSK SQQCNGKDDC 



IT 



GD dSDEK SCP KVNVVTCTKH TYRCLNGLCL SKGNPECDGK EDCSDC JSDEfc 



DC 



DCGLRSFT RQAR 



7VGGTD ADEGEWPWQV SLHALGQGH I CGASLISPNW 

LVSAA@CYID DRGFRYSDPT QWTAFLGLHD QSQRSAPGVQ ERRLKRI I SH 

PFFNDFTFDY ©TALLELEKP AEYSSMVRPI CLPDASHVFP AGKAI WVTGW 

GHTQYGGTGA LILQKGEIRV INQTTCENLL PQQITPRMMC VGFLSGGVDS 

CQGC©GGPLS SVEADGRIFQ AGVVSWGDGC AQRNKPGVYT RLPLFRDWIK 



ENTGV 



1 

2 



( SEQ. ID NO : 2 ) 



|NXT] 



|SDE 



o 



Conserved cysteine residue 
Possible N-linked giycosylation site 
Conserved SDE motif 
Potential cleavage site 

Conserved amino acids of catalytic triad H, D, S 



1. Cytoplasmic domain 

2. Transmembrane domain 

3. CUB repeat 

4. Ligand-binding repeat (class A motif) 
of LDL receptor like domain 

5. Serine protease 
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LOCUS 

DEFINITION 

ACCESSION 

NIO 

KEYWORDS 
SOURCE 
ORGANISE 



KSU2042B 2900 bp oW^- 

Human SNCI9 mRKA seouence. 
U20428 
gl890631 



F.ErEREI-CE 

AUTHORS 

TITLE 

journal 

r.ErEn2!*CE 
AUTHORS 
TITLE 
JOUAKAL 



human. 
Ko.no sapiens 

Eukaryotae; oi tocnoncrial eukaryct**.- :<*z*zzz: Char 
Vertebrata; Eutheria; Primates; Caiarr-ir.i; :-:_.-inic 

1 tbases 1 Co 2900) 
Zheng, S., Cai.X. , Geng.L., Cao.J.. l:.i:.c,L. a.-.-l -1*.; 
SNC19 gene in Hcz^o sapiens 
U'-o-jblisKed 

'2 (bases 1 to 29001 

Direct Submission 

S -bait ted (30- JAN- 1 995 ( Shu Zheng. liner ::.-:;;i:ce 
!-'.ed i ca I University. Her.gincu. 3 1 C 0 C > . ?-srcl*s .-.ep-jt 



[Figure: nucleotide sequence alignment of TADG-15 with SNC19 (GenBank accession U20428). The aligned sequences are not recoverable from the scanned figure; the full sequences appear in the Sequence Listing as SEQ ID NO: 1 (TADG-15 cDNA) and SEQ ID NO: 9 (SNC19 mRNA).]
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SEQUENCE LISTING 

<110> O'Brien, Timothy J. 

Tanimoto, Hirotoshi 

<120> TADG-15: An Extracellular Serine Protease 

Overexpressed in Breast and Ovarian Carcinomas 

<130> D6064PCT 

<140> PCT/US99/03436 

<141> 1999-02-18 

<150> US 09/027,337 

<151> 1998-02-20 

<160> 14 

<210> 1 
<211> 3147 



<212> DNA 

<213> Homo sapiens 

<220> 

<223> cDNA sequence of TADG-15 

<400> 1 

tcaagagcgg cctcggggta ccatggggag cgatcgggcc cgcaagggcg 50 

gagggggccc gaaggacttc ggcgcgggac tcaagtacaa ctcccggcac 100 

gagaaagtga atggcttgga ggaaggcgtg gagttcctgc cagtcaacaa 150 

cgtcaagaag gtggaaaagc atggcccggg gcgctgggtg gtgctggcag 200 

ccgtgctgat cggcctcctc ttggtcttgc tggggatcgg cttcctggtg 250 

tggcatttgc agtaccggga cgtgcgtgtc cagaaggtct tcaatggcta 300 

catgaggatc acaaatgaga attttgtgga tgcctacgag aactccaact 350 

ccactgagtt tgtaagcctg gccagcaagg tgaaggacgc gctgaagctg 400 

ctgtacagcg gagtcccatt cctgggcccc taccacaagg agtcggctgt 450 

gacggccttc agcgagggca gcgtcatcgc ctactactgg tctgagttca 500 

gcatcccgca gcacctggtg gaggaggccg agcgcgtcat ggccgaggag 550 

cgcgtagtca tgctgccccc gcgggcgcgc tccctgaagt cctttgtggt 600 

cacctcagtg gtggctttcc ccacggactc caaaacagta cagaggaccc 650 

aggacaacag ctgcagcttt ggcctgcacg cccgcggtgt ggagctgatg 700 

cgcttcacca cgcccggctt ccctgacagc ccctaccccg ctcatgcccg 750 

ctgccagtgg gccctgcggg gggacgccga ctcagtgctg agcctcacct 800 

tccgcagctt tgaccttgcg tcctgcgacg agcgcggcag cgacctggtg 850 

acggtgtaca acaccctgag ccccatggag ccccacgccc tggtgcagtt 900 

gtgtggcacc taccctccct cctacaacct gaccttccac tcctcccaga 950 
acgtcctgct catcacactg ataaccaaca ctgagcggcg gcatcccggc 1000 
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tttgaggcca ccttcttcca gctgcctagg atgagcagct gtggaggccg 1050 
cttacgtaaa gcccagggga cattcaacag cccctactac ccaggccact 1100 
acccacccaa cattgactgc acatggaaca ttgaggtgcc caacaaccag 1150 
catgtgaagg tgagcttcaa attcttctac ctgctggagc ccggcgtgcc 1200 
tgcgggcacc tgccccaagg actacgtgga gatcaatggg gagaaatact 1250 
gcggagagag gtcccagttc gtcgtcacca gcaacagcaa caagatcaca 1300 
gttcgcttcc actcagatca gtcctacacc gacaccggct tcttagctga 1350 
atacctctcc tacgactcca gtgacccatg cccggggcag ttcacgtgcc 1400 
gcacggggcg gtgtatccgg aaggagctgc gctgtgatgg ctgggccgac 1450 
tgcaccgacc acagcgatga gctcaactgc agttgcgacg ccggccacca 1500 
gttcacgtgc aagaacaagt tctgcaagcc cctcttctgg gtctgcgaca 1550 
gtgtgaacga ctgcggagac aacagcgacg agcaggggtg cagttgtccg 1600 
gcccagacct tcaggtgttc caatgggaag tgcctctcga aaagccagca 1650 
gtgcaatggg aaggacgact gtggggacgg gtccgacgag gcctcctgcc 1700 
ccaaggtgaa cgtcgtcact tgtaccaaac acacctaccg ctgcctcaat 1750 
gggccctgct tgagcaaggg caaccctgag tgtgacggga aggaggactg 1800 
cagcgacggc tcagatgaga aggactgcga ctgtgggctg cggtcattca 1850 
cgagacaggc tcgtgttgtt gggggcacgg atgcggatga gggcgagtgg 1900 
ccctggcagg taagcctgca tgctctgggc cagggccaca tctgcggtgc 1950 
ttccctcatc tctcccaact ggctggtctc tgccgcacac tgctacatcg 2000 
atgacagagg attcaggtac tcagacccca cgcagtggac ggccttcctg 2050 
ggcttgcacg accagagcca gcgcagcgcc cctggggtgc aggagcgcag 2100 
gctcaagcgc atcatctccc accccttctt caatgacttc accttcgact 2150 
atgacatcgc gctgctggag ctggagaaac cggcagagta cagctccatg 2200 
gtgcggccca tctgcctgcc ggacgcctcc catgtcttcc ctgccggcaa 2250 
ggccatctgg gtcacgggct ggggacacac ccagtatgga ggcactggcg 2300 
cgctgatcct gcaaaagggt gagatccgcg tcatcaacca gaccacctgc 2350 
gagaacctcc tgccgcagca gatcacgccg cgcatgatgt gcgtgggctt 2400 
cctcagcggc ggcgtggact cctgccaggg tgattccggg ggacccctgt 2450 
ccagcgtgga ggcggatggg cggatcttcc aggccggtgt ggtgagctgg 2500 
ggagacggct gcgctcagag gaacaagcca ggcgtgtaca caaggctccc 2550 
tctgtttcgg gactggatca aagagaacac tggggtatag gggccggggc 2600 
cacccaaatg tgtacacctg cggggccacc catcgtccac cccagtgtgc 2650 
acgcctgcag gctggagact ggaccgctga ctgcaccagc gcccccagaa 2700 
catacactgt gaactcaatc tccagggctc caaatctgcc tagaaaacct 2750 
ctcgcttcct cagcctccaa agtggagctg ggaggtagaa ggggaggaca 2800 
ctggtggttc tactgaccca actgggggca aaggtttgaa gacacagcct 2850 
cccccgccag ccccaagctg ggccgaggcg cgtttgtgta tatctgcctc 2900 

SEQ 2/17 
SUBSTITUTE SHEET (RULE 26) 



WO 99/42120 



PCT/US99/03436 



ccctgtctgt aaggagcagc gggaacggag cttcggagcc tcctcagtga 2950 

aggtggtggg gctgccggat ctgggctgtg gggcccttgg gccacgctct 3000 

tgaggaagcc caggctcgga ggaccctgga aaacagacgg gtctgagact 3050 

gaaattgttt taccagctcc cagggtggac ttcagtgtgt gtatttgtgt 3100 

aaatgggtaa aacaatttat ttctttttaa aaaaaaaaaa aaaaaaa 3147 

<210> 2 

<211> 855 

<212> PRT 

<213> Homo sapiens 



<220> 

<223> Amino acid sequence of TADG-15 encoded by cDNA 
<400> 2 

Met Gly Ser Asp Arg Ala Arg Lys Gly Gly Gly Gly Pro Lys Asp 

5 10 15 

Phe Gly Ala Gly Leu Lys Tyr Asn Ser Arg His Glu Lys Val Asn 

20 25 30 

Gly Leu Glu Glu Gly Val Glu Phe Leu Pro Val Asn Asn Val Lys 

35 40 45 

Lys Val Glu Lys His Gly Pro Gly Arg Trp Val Val Leu Ala Ala 

50 55 60 

Val Leu Ile Gly Leu Leu Leu Val Leu Leu Gly Ile Gly Phe Leu 

65 70 75 

Val Trp His Leu Gln Tyr Arg Asp Val Arg Val Gln Lys Val Phe 

80 85 90 

Asn Gly Tyr Met Arg Ile Thr Asn Glu Asn Phe Val Asp Ala Tyr 

95 100 105 

Glu Asn Ser Asn Ser Thr Glu Phe Val Ser Leu Ala Ser Lys Val 

110 115 120 

Lys Asp Ala Leu Lys Leu Leu Tyr Ser Gly Val Pro Phe Leu Gly 

125 130 135 

Pro Tyr His Lys Glu Ser Ala Val Thr Ala Phe Ser Glu Gly Ser 

140 145 150 

Val Ile Ala Tyr Tyr Trp Ser Glu Phe Ser Ile Pro Gln His Leu 

155 160 165 

Val Glu Glu Ala Glu Arg Val Met Ala Glu Glu Arg Val Val Met 

170 175 180 



Leu Pro Pro Arg Ala Arg Ser Leu Lys Ser Phe Val Val Thr Ser 

185 190 195 

Val Val Ala Phe Pro Thr Asp Ser Lys Thr Val Gln Arg Thr Gln 

200 205 210 

Asp Asn Ser Cys Ser Phe Gly Leu His Ala Arg Gly Val Glu Leu 

215 220 225 

Met Arg Phe Thr Thr Pro Gly Phe Pro Asp Ser Pro Tyr Pro Ala 

230 235 240 

His Ala Arg Cys Gln Trp Ala Leu Arg Gly Asp Ala Asp Ser Val 

245 250 255 

Leu Ser Leu Thr Phe Arg Ser Phe Asp Leu Ala Ser Cys Asp Glu 

260 265 270 

Arg Gly Ser Asp Leu Val Thr Val Tyr Asn Thr Leu Ser Pro Met 

275 280 285 

Glu Pro His Ala Leu Val Gln Leu Cys Gly Thr Tyr Pro Pro Ser 

290 295 300 

Tyr Asn Leu Thr Phe His Ser Ser Gln Asn Val Leu Leu Ile Thr 

305 310 315 

Leu Ile Thr Asn Thr Glu Arg Arg His Pro Gly Phe Glu Ala Thr 

320 325 330 

Phe Phe Gln Leu Pro Arg Met Ser Ser Cys Gly Gly Arg Leu Arg 

335 340 345 

Lys Ala Gln Gly Thr Phe Asn Ser Pro Tyr Tyr Pro Gly His Tyr 

350 355 360 

Pro Pro Asn Ile Asp Cys Thr Trp Asn Ile Glu Val Pro Asn Asn 

365 370 375 

Gln His Val Lys Val Ser Phe Lys Phe Phe Tyr Leu Leu Glu Pro 

380 385 390 

Gly Val Pro Ala Gly Thr Cys Pro Lys Asp Tyr Val Glu Ile Asn 

395 400 405 

Gly Glu Lys Tyr Cys Gly Glu Arg Ser Gln Phe Val Val Thr Ser 

410 415 420 

Asn Ser Asn Lys Ile Thr Val Arg Phe His Ser Asp Gln Ser Tyr 

425 430 435 

Thr Asp Thr Gly Phe Leu Ala Glu Tyr Leu Ser Tyr Asp Ser Ser 

440 445 450 

Asp Pro Cys Pro Gly Gln Phe Thr Cys Arg Thr Gly Arg Cys Ile 

455 460 465 


Arg Lys Glu Leu Arg Cys Asp Gly Trp Ala Asp Cys Thr Asp His 

470 475 480 

Ser Asp Glu Leu Asn Cys Ser Cys Asp Ala Gly His Gln Phe Thr 

485 490 495 

Cys Lys Asn Lys Phe Cys Lys Pro Leu Phe Trp Val Cys Asp Ser 

500 505 510 

Val Asn Asp Cys Gly Asp Asn Ser Asp Glu Gln Gly Cys Ser Cys 

515 520 525 

Pro Ala Gln Thr Phe Arg Cys Ser Asn Gly Lys Cys Leu Ser Lys 

530 535 540 

Ser Gln Gln Cys Asn Gly Lys Asp Asp Cys Gly Asp Gly Ser Asp 

545 550 555 

Glu Ala Ser Cys Pro Lys Val Asn Val Val Thr Cys Thr Lys His 

560 565 570 

Thr Tyr Arg Cys Leu Asn Gly Leu Cys Leu Ser Lys Gly Asn Pro 

575 580 585 

Glu Cys Asp Gly Lys Glu Asp Cys Ser Asp Gly Ser Asp Glu Lys 

590 595 600 

Asp Cys Asp Cys Gly Leu Arg Ser Phe Thr Arg Gln Ala Arg Val 

605 610 615 

Val Gly Gly Thr Asp Ala Asp Glu Gly Glu Trp Pro Trp Gln Val 

620 625 630 

Ser Leu His Ala Leu Gly Gln Gly His Ile Cys Gly Ala Ser Leu 

635 640 645 

Ile Ser Pro Asn Trp Leu Val Ser Ala Ala His Cys Tyr Ile Asp 

650 655 660 

Asp Arg Gly Phe Arg Tyr Ser Asp Pro Thr Gln Trp Thr Ala Phe 

665 670 675 

Leu Gly Leu His Asp Gln Ser Gln Arg Ser Ala Pro Gly Val Gln 

680 685 690 

Glu Arg Arg Leu Lys Arg Ile Ile Ser His Pro Phe Phe Asn Asp 

695 700 705 

Phe Thr Phe Asp Tyr Asp Ile Ala Leu Leu Glu Leu Glu Lys Pro 

710 715 720 

Ala Glu Tyr Ser Ser Met Val Arg Pro Ile Cys Leu Pro Asp Ala 

725 730 735 

Ser His Val Phe Pro Ala Gly Lys Ala Ile Trp Val Thr Gly Trp 

740 745 750 



Gly His Thr Gln Tyr Gly Gly Thr Gly Ala Leu Ile Leu Gln Lys 

755 760 765 

Gly Glu Ile Arg Val Ile Asn Gln Thr Thr Cys Glu Asn Leu Leu 

770 775 780 

Pro Gln Gln Ile Thr Pro Arg Met Met Cys Val Gly Phe Leu Ser 

785 790 795 

Gly Gly Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Ser 

800 805 810 

Ser Val Glu Ala Asp Gly Arg Ile Phe Gln Ala Gly Val Val Ser 

815 820 825 

Trp Gly Asp Gly Cys Ala Gln Arg Asn Lys Pro Gly Val Tyr Thr 

830 835 840 

Arg Leu Pro Leu Phe Arg Asp Trp Ile Lys Glu Asn Thr Gly Val 

845 850 855 
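
SEQ ID NO: 2 is the amino acid sequence encoded by the cDNA of SEQ ID NO: 1. A minimal sketch of that relationship is given below, assuming the standard genetic code and assuming that translation begins at the first ATG of the cDNA. The helper name `translate` is hypothetical, and only the first 50 nucleotides of SEQ ID NO: 1 are used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Nucleotides 1 to 50 of SEQ ID NO: 1, copied from the listing (spaces removed).
CDNA_START = "tcaagagcgg cctcggggta ccatggggag cgatcgggcc cgcaagggcg".replace(" ", "")


def translate(dna):
    """Translate from the first 'atg' until a stop codon or the last complete codon."""
    start = dna.find("atg")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends the reading frame
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate(CDNA_START))
```

For this fragment the sketch yields the one-letter string MGSDRARKG, which corresponds to residues 1 to 9 of SEQ ID NO: 2 (Met Gly Ser Asp Arg Ala Arg Lys Gly).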

<210> 3 
<211> 256 
<212> PRT 
<213> Unknown 
<220> 

<221> DOMAIN 

<223> Serine protease catalytic domain of hepsin (Heps) 

homologous to similar domain in TADG-15 
<400> 3 

Arg Ile Val Gly Gly Arg Asp Thr Ser Leu Gly Arg Trp Pro Trp 

5 10 15 

Gln Val Ser Leu Arg Tyr Asp Gly Ala His Leu Cys Gly Gly Ser 

20 25 30 

Leu Leu Ser Gly Asp Trp Val Leu Thr Ala Ala His Cys Phe Pro 

35 40 45 

Glu Arg Asn Arg Val Leu Ser Arg Trp Arg Val Phe Ala Gly Ala 

50 55 60 

Val Ala Gln Ala Ser Pro His Gly Leu Gln Leu Gly Val Gln Ala 

65 70 75 

Val Val Tyr His Gly Gly Tyr Leu Pro Phe Arg Asp Pro Asn Ser 

80 85 90 

Glu Glu Asn Ser Asn Asp Ile Ala Leu Val His Leu Ser Ser Pro 

95 100 105 



Leu Pro Leu Thr Glu Tyr Ile Gln Pro Val Cys Leu Pro Ala Ala 

110 115 120 

Gly Gln Ala Leu Val Asp Gly Lys Ile Cys Thr Val Thr Gly Trp 

125 130 135 

Gly Asn Thr Gln Tyr Tyr Gly Gln Gln Ala Gly Val Leu Gln Glu 

140 145 150 

Ala Arg Val Pro Ile Ile Ser Asn Asp Val Cys Asn Gly Ala Asp 

155 160 165 

Phe Tyr Gly Asn Gln Ile Lys Pro Lys Met Phe Cys Ala Gly Tyr 

170 175 180 

Pro Glu Gly Gly Ile Asp Ala Cys Gln Gly Asp Ser Gly Gly Pro 

185 190 195 

Phe Val Cys Glu Asp Ser Ile Ser Arg Thr Pro Arg Trp Arg Leu 

200 205 210 

Cys Gly Ile Val Ser Trp Gly Thr Gly Cys Ala Leu Ala Gln Lys 

215 220 225 

Pro Gly Val Tyr Thr Lys Val Ser Asp Phe Arg Glu Trp Ile Phe 

230 235 240 

Gln Ala Ile Lys Thr His Ser Glu Ala Ser Gly Met Val Thr Gln 

245 250 255 

Leu 

<210> 4 
<211> 225 
<212> PRT 
<213> Unknown 
<220> 

<221> DOMAIN 

<223> Serine protease catalytic domain of Scce 

homologous to similar domain in TADG-15. 
<400> 4 

Lys Ile Ile Asp Gly Ala Pro Cys Ala Arg Gly Ser His Pro Trp 

5 10 15 

Gln Val Ala Leu Leu Ser Gly Asn Gln Leu His Cys Gly Gly Val 

20 25 30 

Leu Val Asn Glu Arg Trp Val Leu Thr Ala Ala His Cys Lys Met 

35 40 45 



Asn Glu Tyr Thr Val His Leu Gly Ser Asp Thr Leu Gly Asp Arg 

50 55 60 

Arg Ala Gln Arg Ile Lys Ala Ser Lys Ser Phe Arg His Pro Gly 

65 70 75 

Tyr Ser Thr Gln Thr His Val Asn Asp Leu Met Leu Val Lys Leu 

80 85 90 

Asn Ser Gln Ala Arg Leu Ser Ser Met Val Lys Lys Val Arg Leu 

95 100 105 

Pro Ser Arg Cys Glu Pro Pro Gly Thr Thr Cys Thr Val Ser Gly 

110 115 120 

Trp Gly Thr Thr Thr Ser Pro Asp Val Thr Phe Pro Ser Asp Leu 

125 130 135 

Met Cys Val Asp Val Lys Leu Ile Ser Pro Gln Asp Cys Thr Lys 

140 145 150 

Val Tyr Lys Asp Leu Leu Glu Asn Ser Met Leu Cys Ala Gly Ile 

155 160 165 

Pro Asp Ser Lys Lys Asn Ala Cys Asn Gly Asp Ser Gly Gly Pro 

170 175 180 

Leu Val Cys Arg Gly Thr Leu Gln Gly Leu Val Ser Trp Gly Thr 

185 190 195 

Phe Pro Cys Gly Gln Pro Asn Asp Pro Gly Val Tyr Thr Gln Val 

200 205 210 

Cys Lys Phe Thr Lys Trp Ile Asn Asp Thr Met Lys Lys His Arg 

215 220 225 

<210> 5 
<211> 225 
<212> PRT 
<213> Unknown 
<220> 

<221> DOMAIN 

<223> Serine protease catalytic domain of trypsin 

(Try) homologous to similar domain in TADG-15 
<400> 5 

Lys Ile Val Gly Gly Tyr Asn Cys Glu Glu Asn Ser Val Pro Tyr 

5 10 15 

Gln Val Ser Leu Asn Ser Gly Tyr His Phe Cys Gly Gly Ser Leu 

20 25 30 



Ile Asn Glu Gln Trp Val Val Ser Ala Gly His Cys Tyr Lys Ser 

35 40 45 

Arg Ile Gln Val Arg Leu Gly Glu His Asn Ile Glu Val Leu Glu 

50 55 60 

Gly Asn Glu Gln Phe Ile Asn Ala Ala Lys Ile Ile Arg His Pro 

65 70 75 

Gln Tyr Asp Arg Lys Thr Leu Asn Asn Asp Ile Met Leu Ile Lys 

80 85 90 

Leu Ser Ser Arg Ala Val Ile Asn Ala Arg Val Ser Thr Ile Ser 

95 100 105 

Leu Pro Thr Ala Pro Pro Ala Thr Gly Thr Lys Cys Leu Ile Ser 

110 115 120 

Gly Trp Gly Asn Thr Ala Ser Ser Gly Ala Asp Tyr Pro Asp Glu 

125 130 135 

Leu Gln Cys Leu Asp Ala Pro Val Leu Ser Gln Ala Lys Cys Glu 

140 145 150 

Ala Ser Tyr Pro Gly Lys Ile Thr Ser Asn Met Phe Cys Val Gly 

155 160 165 

Phe Leu Glu Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly 

170 175 180 

Pro Val Val Cys Asn Gly Gln Leu Gln Gly Val Val Ser Trp Gly 

185 190 195 

Asp Gly Cys Ala Gln Lys Asn Lys Pro Gly Val Tyr Thr Lys Val 

200 205 210 

Tyr Asn Tyr Val Lys Trp Ile Lys Asn Thr Ile Ala Ala Asn Ser 

215 220 225 


<210> 6 
<211> 231 
<212> PRT 
<213> Unknown 
<220> 

<221> DOMAIN 

<223> Serine [remainder of line illegible in source] 

(Chymb) homologous to similar domain in TADG-15. 
<400> 6 

Arg Ile Val Asn Gly Glu Asp Ala Val Pro Gly Ser Trp Pro Trp 

5 10 15 



Gln Val Ser Leu Gln Asp Lys Thr Gly Phe His Phe Cys Gly Gly 

20 25 30 

Ser Leu Ile Ser Glu Asp Trp Val Val Thr Ala Ala His Cys Gly 

35 40 45 

Val Arg Thr Ser Asp Val Val Val Ala Gly Glu Phe Asp Gln Gly 

50 55 60 

Ser Asp Glu Glu Asn Ile Gln Val Leu Lys Ile Ala Lys Val Phe 

65 70 75 

Lys Asn Pro Lys Phe Ser Ile Leu Thr Val Asn Asn Asp Ile Thr 

80 85 90 

Leu Leu Lys Leu Ala Thr Pro Ala Arg Phe Ser Gln Thr Val Ser 

95 100 105 

Ala Val Cys Leu Pro Ser Ala Asp Asp Asp Phe Pro Ala Gly Thr 

110 115 120 

Leu Cys Ala Thr Thr Gly Trp Gly Lys Thr Lys Tyr Asn Ala Asn 

125 130 135 

Lys Thr Pro Asp Lys Leu Gln Gln Ala Ala Leu Pro Leu Leu Ser 

140 145 150 

Asn Ala Glu Cys Lys Lys Ser Trp Gly Arg Arg Ile Thr Asp Val 

155 160 165 

Met Ile Cys Ala Gly Ala Ser Gly Val Ser Ser Cys Met Gly Asp 

170 175 180 

Ser Gly Gly Pro Leu Val Cys Gln Lys Asp Gly Ala Trp Thr Leu 

185 190 195 

Val Gly Ile Val Ser Trp Gly Ser Asp Thr Cys Ser Thr Ser Ser 

200 205 210 

Pro Gly Val Tyr Ala Arg Val Thr Lys Leu Ile Pro Trp Val Gln 

215 220 225 

Lys Ile Leu Ala Ala Asn 

230 

<210> 7 

<211> 255 

<212> PRT 

<213> Unknown 
<220> 

<221> DOMAIN 

<223> Serine protease catalytic domain of factor 7 

(Fac7) homologous to similar domain in TADG-15 



<400> 7 

Arg Ile Val Gly Gly Lys Val Cys Pro Lys Gly Glu Cys Pro Trp 

5 10 15 

Gln Val Leu Leu Leu Val Asn Gly Ala Gln Leu Cys Gly Gly Thr 

20 25 30 

Leu Ile Asn Thr Ile Trp Val Val Ser Ala Ala His Cys Phe Asp 

35 40 45 

Lys Ile Lys Asn Trp Arg Asn Leu Ile Ala Val Leu Gly Glu His 

50 55 60 

Asp Leu Ser Glu His Asp Gly Asp Glu Gln Ser Arg Arg Val Ala 

65 70 75 

Gln Val Ile Ile Pro Ser Thr Tyr Val Pro Gly Thr Thr Asn His 

80 85 90 

Asp Ile Ala Leu Leu Arg Leu His Gln Pro Val Val Leu Thr Asp 

95 100 105 

His Val Val Pro Leu Cys Leu Pro Glu Arg Thr Phe Ser Glu Arg 

110 115 120 

Thr Leu Ala Phe Val Arg Phe Ser Leu Val Ser Gly Trp Gly Gln 

125 130 135 

Leu Leu Asp Arg Gly Ala Thr Ala Leu Glu Leu Met Val Leu Asn 

140 145 150 

Val Pro Arg Leu Met Thr Gln Asp Cys Leu Gln Gln Ser Arg Lys 

155 160 165 

Val Gly Asp Ser Pro Asn Ile Thr Glu Tyr Met Phe Cys Ala Gly 

170 175 180 

Tyr Ser Asp Gly Ser Lys Asp Ser Cys Lys Gly Asp Ser Gly Gly 

185 190 195 

Pro His Ala Thr His Tyr Arg Gly Thr Trp Tyr Leu Thr Gly Ile 

200 205 210 

Val Ser Trp Gly Gln Gly Cys Ala Thr Val Gly His Phe Gly Val 

215 220 225 

Tyr Thr Arg Val Ser Gln Tyr Ile Glu Trp Leu Gln Lys Leu Met 

230 235 240 

Arg Ser Glu Pro Arg Pro Gly Val Leu Leu Arg Ala Pro Phe Pro 

245 250 255 

<210> 8 
<211> 253 



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<212> PRT 
<213> Unknown 
<220> 

<221> DOMAIN 

<223> Serine protease catalytic domain of tissue 

plasminogen activator (Tpa) homologous to 
similar domain in TADG-15. 

<400> 8 

Arg Ile Lys Gly Gly Leu Phe Ala Asp Ile Ala Ser His Pro Trp 

5 10 15 

Gln Ala Ala Ile Phe Ala Lys His Arg Arg Ser Pro Gly Glu Arg 

20 25 30 

Phe Leu Cys Gly Gly Ile Leu Ile Ser Ser Cys Trp Ile Leu Ser 

35 40 45 

Ala Ala His Cys Phe Gln Glu Arg Phe Pro Pro His His Leu Thr 

50 55 60 

Val Ile Leu Gly Arg Thr Tyr Arg Val Val Pro Gly Glu Glu Glu 

65 70 75 

Gln Lys Phe Glu Val Glu Lys Tyr Ile Val His Lys Glu Phe Asp 

80 85 90 

Asp Asp Thr Tyr Asp Asn Asp Ile Ala Leu Leu Gln Leu Lys Ser 

95 100 105 

Asp Ser Ser Arg Cys Ala Gln Glu Ser Ser Val Val Arg Thr Val 

110 115 120 

Cys Leu Pro Pro Ala Asp Leu Gln Leu Pro Asp Trp Thr Glu Cys 

125 130 135 

Glu Leu Ser Gly Tyr Gly Lys His Glu Ala Leu Ser Pro Phe Tyr 

140 145 150 

Ser Glu Arg Leu Lys Glu Ala His Val Arg Leu Tyr Pro Ser Ser 

155 160 165 

Arg Cys Thr Ser Gln His Leu Leu Asn Arg Thr Val Thr Asp Asn 

170 175 180 

Met Leu Cys Ala Gly Asp Thr Arg Ser Gly Gly Pro Gln Ala Asn 

185 190 195 

Leu His Asp Ala Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys 

200 205 210 



Leu Asn Asp Gly Arg Met Thr Leu Val Gly Ile Ile Ser Trp Gly 

215 220 225 

Leu Gly Cys Gly Gln Lys Asp Val Pro Gly Val Tyr Thr Lys Val 

230 235 240 

Thr Asn Tyr Leu Asp Trp Ile Arg Asp Asn Met Arg Pro 

245 250 

<210> 9 
<211> 2900 
<212> DNA 
<213> Homo sapiens 

<220> 

<223> SNC19 mRNA sequence (U20428) 

<400> 9 

cgctgggtgg tgctggcagc cgtgctgatc ggcctcctct tggtcttgct 50 
ggggatcggc ttcctggtgt ggcatttgca gtaccgggac gtgcgtgtcc 100 
agaaggtctt caatggctac atgaggatca caaatgagaa ttttgtggat 150 
gcctacgaga actccaactc cactgagttt gtaagcctgg ccagcaaggt 200 
gaaggacgcg ctgaagctgc tgtacagcgg agtcccattc ctgggcccct 250 
accacaagga gtcggctgtg acggccttca gcgagggcag cgtcatcgcc 300 
tactactggt ctgagttcag catcccgcag cacctggttg aggaggccga 350 
gcgcgtcatg gccaggagcg cgtagtcatg ctgcccccgc gggcgcgctc 400 
cctgaagtcc tttgtggtca cctcagtggt ggctttcccc acggactcca 450 
aaacagtaca gaggacccag gacaacagct gcagctttgg cctgcacgcc 500 
gcggtgtgga gctgatgcgc ttcaccacgc cggcttccct gacagcccct 550 
accccgctca tgcccgctgc cagtgggctg cggggacgcg acgcagtgct 600 
gagctactcg agctgactcg cagcttgact gcgcctcgac gagcgcggca 650 
gcgacctggt gacgtgtaca acaccctgag ccccatggag ccccacgcct 700 
ggtgagtgtg tggcacctac cctccctcct acaacctgac cttccactcc 750 
ctcccacgaa cgtcctgctc atcacactga taaccaacac tgacgcggca 800 
tcccggcttt gaggccacct tcttccagct gcctaggatg agcagctgtg 850 
gaggccgctt acgtaaagcc caggggacat tcaacagccc ctactaccca 900 
ggccactacc cacccaacat tgactgcaca tggaaaattg aggtgcccaa 950 
caaccagcat gtgaaggtgc gcttcaaatt cttctacctg ctggagcccg 1000 
gcgtgcctgc gggcacctgc cccaaggact acgtggagat caatggggag 1050 
aaatactgcg gagagaggtc ccagttcgtc gtcaccagca acagcaacaa 1100 
gatcacagtt cgcttccact cagatcagtc ctacaccgac accggcttct 1150 
tagctgaata cctctcctac gactccagtg acccatgccc ggggcagttc 1200 
acgtgccgca cggggcggtg tatccggaag gagctgcgct gtgatggctg 1250 



ggcgactgca ccgaccacag cgatgagctc aactgcagtt gcgacgccgg 1300 

ccaccagttc acgtgcaaga gcaagttctg caagctcttc tgggtctgcg 1350 

acagtgtgaa cgagtgcgga gacaacagcg acgagcaggg ttgcatttgt 1400 

ccggacccag accttcaggt gttccaatgg gaagtgcctc tcgaaaagcc 1450 

agcagtgcaa tgggaaggac gactgtgggg acgggtccga cgaggcctcc 1500 

tgccccaagg tgaacgtcgt cacttgtacc aaacacacct accgctgcct 1550 

caatgggctc tgcttgagca agggcaaccc tgagtgtgac gggaaggagg 1600 

actgtagcga cggctcagat gagaaggact gcgactgtgg gctgcggtca 1650 

ttcacgagac aggctcgtgt tgttgggggc acggatgcgg atgagggcga 1700 

gtggccctgg caggtaagcc tgcatgctct gggccagggc cacatctgcg 1750 

gtgcttccct catctctccc aactggctgg tctctgccgc acactgctac 1800 

atcgatgaca gaggattcag gtactcagac cccacgcagg acggccttcc 1850 

tgggcttgca cgaccagagc cagcgcaggc cctggggtgc aggagcgcag 1900 

gctcaagcgc atcatctccc accccttctt caatgacttc accttcgact 1950 

atgacatcgc gctgctggag ctggagaaac cggcagagta cagctccatg 2000 

gtgcggccca tctgcctgcc ggacgcctgc catgtcttcc ctgccggcaa 2050 

ggccatctgg gtcacgggct ggggacacac ccagtatgga ggcactggcg 2100 

cgctgatcct gcaaaagggt gagatccgcg tcatcaacca gaccacctgc 2150 

gagaacctcc tgccgcagca gatcacgccg cgcatgatgt gcgtgggctt 2200 

cctcagcggc ggcgtggact cctgccaggg tgattccggg ggacccctgt 2250 

ccagcgtgga ggcggatggg cggatcttcc aggccggtgt ggtgagctgg 2300 

ggagacgctg cgctcagagg aacaagccag gcgtgtacac aaggctccct 2350 

ctgtttcggg aatggatcaa agagaacact ggggtatagg ggccggggcc 2400 

acccaaatgt gtacacctgc ggggccaccc atcgtccacc ccagtgtgca 2450 

cgcctgcagg ctggagactc gcgcaccgtg acctgcacca gcgccccaga 2500 

acatacactg tgaactcatc tccaggctca aatctgctag aaaacctctc 2550 

gcttcctcag cctccaaagt ggagctggga gggtagaagg ggaggaacac 2600 

tggtggttct actgacccaa ctggggcaag gtttgaagca cagctccggc 2650 

agcccaagtg ggcgaggacg cgtttgtgca tactgccctg ctctatacac 2700 

ggaagacctg gatctctagt gagtgtgact gccggatctg gctgtggtcc 2750 

ttggccacgc ttcttgagga agcccaggct cggaggaccc tggaaaacag 2800 

acgggtctga gactgaaaat ggtttaccag ctcccaggtg acttcagtgt 2850 
gtgtattgtg taaatgagta aaacatttta tttcttttta aaaaaaaaaa 2900 



<210> 10
<211> 20
<212> DNA

<213> Artificial Sequence

<220>

<221> primer

<223> Forward primer for analysis of overexpression

of TADG-15 mRNA by quantitative PCR.
<400> 10
atgacagagg attcaggtac 20

<210> 11 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220> 

<221> primer 

<223> Reverse primer for analysis of overexpression 

of TADG-15 mRNA by quantitative PCR. 
<400> 11 
gaaggtgaag tcattgaaga 20 
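
The two TADG-15 primers (SEQ ID NO: 10 and 11) can be checked against a template by searching for the forward primer and for the reverse complement of the reverse primer. The following is a minimal sketch, not part of the original listing. It assumes the template is the excerpt of SEQ ID NO: 9 at positions 1801 to 1950 as printed in this listing; the names `reverse_complement` and `locate_amplicon` are hypothetical helpers introduced for illustration.

```python
# Excerpt of SEQ ID NO: 9 (positions 1801-1950) as printed in this listing.
# Assumption: the printed excerpt is used as the template for illustration only.
TEMPLATE = (
    "atcgatgaca gaggattcag gtactcagac cccacgcagg acggccttcc"
    "tgggcttgca cgaccagagc cagcgcaggc cctggggtgc aggagcgcag"
    "gctcaagcgc atcatctccc accccttctt caatgacttc accttcgact"
).replace(" ", "")
TEMPLATE_START = 1801  # 1-based position of the first base of the excerpt

FORWARD = "atgacagagg attcaggtac".replace(" ", "")  # SEQ ID NO: 10
REVERSE = "gaaggtgaag tcattgaaga".replace(" ", "")  # SEQ ID NO: 11

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_amplicon(template: str, forward: str, reverse: str, start: int = 1):
    """Return (first_base, last_base, length) of the span bounded by the
    two primers, in 1-based coordinates, or None if either primer site is
    absent from the template."""
    fwd = template.find(forward)
    if fwd == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the printed
    # strand we search for its reverse complement downstream of the forward site.
    site = reverse_complement(reverse)
    rev = template.find(site, fwd + len(forward))
    if rev == -1:
        return None
    end = rev + len(site)  # exclusive end index on the template
    return start + fwd, start + end - 1, end - fwd


if __name__ == "__main__":
    print(locate_amplicon(TEMPLATE, FORWARD, REVERSE, TEMPLATE_START))
```

On this excerpt both primer sites are found, and the span they bound is 142 bases long (positions 1805 to 1946 in the numbering printed for SEQ ID NO: 9). The same check could be run against the full-length sequence once it is transcribed without spaces or position numbers.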

<210> 12 

<211> 17

<212> DNA 

<213> Artificial Sequence 
<220> 

<221> primer 

<223> Forward primer for analysis of β-tubulin mRNA

expression by quantitative PCR. 
<400> 12 
tgcattgaca acgaggc 17 

<210> 13 
<211> 17 
<212> DNA 

<213> Artificial Sequence 

<220> 

<221> primer 

<223> Reverse primer for analysis of β-tubulin mRNA

expression by quantitative PCR. 
<400> 13 
ctgtcttgac attgttg 17 
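
As a quick consistency check on the four primer entries (SEQ ID NO: 10 to 13), the lengths declared in the <211> fields can be compared with the printed sequences, and simple properties such as GC content can be computed. The sketch below is illustrative only and is not part of the original listing. The melting-temperature estimate uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is an assumption introduced here as a rough guide for short oligonucleotides; the listing itself states no melting temperatures.

```python
# Primer sequences as printed in the listing, keyed by SEQ ID NO.
PRIMERS = {
    10: "atgacagagg attcaggtac",  # TADG-15 forward
    11: "gaaggtgaag tcattgaaga",  # TADG-15 reverse
    12: "tgcattgaca acgaggc",     # beta-tubulin forward
    13: "ctgtcttgac attgttg",     # beta-tubulin reverse
}

# Lengths declared in the <211> field of each entry.
DECLARED_LENGTH = {10: 20, 11: 20, 12: 17, 13: 17}


def clean(seq: str) -> str:
    """Strip the block spacing used in the listing and lowercase the bases."""
    return seq.replace(" ", "").lower()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    s = clean(seq)
    return (s.count("g") + s.count("c")) / len(s)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.
    Assumption: a rule of thumb for short primers, not a value from the source."""
    s = clean(seq)
    at = s.count("a") + s.count("t")
    gc = s.count("g") + s.count("c")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for seq_id, seq in PRIMERS.items():
        ok = len(clean(seq)) == DECLARED_LENGTH[seq_id]
        print(seq_id, len(clean(seq)), ok, round(gc_fraction(seq), 2), wallace_tm(seq))
```

A mismatch between the computed length and the <211> value would point to a transcription error in the printed primer rather than a property of the primer itself.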



<210> 14

<211> 242 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> DOMAIN 

<223> Serine protease catalytic domain of TADG-15. 

<400> 14 

Arg Val Val Gly Gly Thr Asp Ala Asp Glu Gly Glu Trp Pro Trp

5 10 15

Gln Val Ser Leu His Ala Leu Gly Gln Gly His Ile Cys Gly Ala

20 25 30

Ser Leu Ile Ser Pro Asn Trp Leu Val Ser Ala Ala His Cys Tyr

35 40 45

Ile Asp Asp Arg Gly Phe Arg Tyr Ser Asp Pro Thr Gln Trp Thr

50 55 60

Ala Phe Leu Gly Leu His Asp Gln Ser Gln Arg Ser Ala Pro Gly

65 70 75

Val Gln Glu Arg Arg Leu Lys Arg Ile Ile Ser His Pro Phe Phe

80 85 90

Asn Asp Phe Thr Phe Asp Tyr Asp Ile Ala Leu Leu Glu Leu Glu

95 100 105

Lys Pro Ala Glu Tyr Ser Ser Met Val Arg Pro Ile Cys Leu Pro

110 115 120

Asp Ala Ser His Val Phe Pro Ala Gly Lys Ala Ile Trp Val Thr

125 130 135

Gly Trp Gly His Thr Gln Tyr Gly Gly Thr Gly Ala Leu Ile Leu

140 145 150

Gln Lys Gly Glu Ile Arg Val Ile Asn Gln Thr Thr Cys Glu Asn

155 160 165

Leu Leu Pro Gln Gln Ile Thr Pro Arg Met Met Cys Val Gly Phe

170 175 180

Leu Ser Gly Gly Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro

185 190 195

Leu Ser Ser Val Glu Ala Asp Gly Arg Ile Phe Gln Ala Gly Val

200 205 210

Val Ser Trp Gly Asp Gly Cys Ala Gln Arg Asn Lys Pro Gly Val

215 220 225

Tyr Thr Arg Leu Pro Leu Phe Arg Asp Trp Ile Lys Glu Asn Thr

230 235 240

Gly Val

[Figure: annotated amino acid sequence of TADG-15 (SEQ ID NO: 2). Markings in the figure indicate conserved cysteine residues, possible N-linked glycosylation sites, the conserved SDE motif, the potential cleavage site, and the conserved amino acids of the catalytic triad (H, D, S). Numbered regions of the sequence:

1. Cytoplasmic domain

2. Transmembrane domain

3. CUB repeat

4. Ligand-binding repeat (class A motif) of LDL receptor-like domain

5. Serine protease]




